TABLE 52. Mutagenic Cassette: N, A, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 0
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 6
                    CYSTEINE 0
AAT    YES          ASPARAGINE 2
AAC    YES
CAA    YES          GLUTAMINE 2
CAG    YES
TAT    YES          TYROSINE 2
TAC    YES
                    THREONINE 0
GAT    YES          ASPARTIC ACID 2         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 4
GAC    YES
GAA    YES          GLUTAMIC ACID 2
GAG    YES
AAA    YES          LYSINE 2                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 4
AAG    YES
                    ARGININE 0
CAT    YES          HISTIDINE 2
CAC    YES
TAA    YES          STOP CODON 2            STOP SIGNAL (STP) 2
TAG    YES

TOTAL  16
7 Amino Acids Are Represented
NPL: POL: NEG: POS: STP = 0: 6: 4: 4: 2
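
The tallies in a table of this kind can be recomputed from the standard genetic code. The following is a minimal sketch, not part of the original disclosure: the function name `summarize` and the single-letter amino acid encoding are illustrative assumptions, and the category grouping simply mirrors the one used in the tables (histidine counted with the basic residues).

```python
from collections import Counter
from itertools import product

# Standard genetic code in DNA letters; codons are ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Category grouping as used in the tables ("*" marks a stop codon).
GROUPS = {"NPL": "GAVLIMFWP", "POL": "SCNQYT", "NEG": "DE", "POS": "KRH", "STP": "*"}
CATEGORY = {aa: cat for cat, members in GROUPS.items() for aa in members}


def summarize(first, second, third):
    """Tally a cassette; each argument lists the bases allowed at that position ("N" = any)."""
    positions = [BASES if p == "N" else p for p in (first, second, third)]
    codons = ["".join(c) for c in product(*positions)]
    per_category = Counter(CATEGORY[CODON_TABLE[c]] for c in codons)
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}  # stops are not amino acids
    ratio = tuple(per_category[cat] for cat in GROUPS)      # NPL, POL, NEG, POS, STP
    return len(codons), len(amino_acids), ratio


print(*summarize("N", "A", "N"))  # prints: 16 7 (0, 6, 4, 4, 2)
```

For the N, A, N cassette this gives 16 codons, 7 amino acids, and the ratio 0: 6: 4: 4: 2 stated in the table summary.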


TABLE 53. Mutagenic Cassette: N, C, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 8
GCT    YES          ALANINE 4
GCC    YES
GCA    YES
GCG    YES
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
CCT    YES          PROLINE 4
CCC    YES
CCA    YES
CCG    YES
TCT    YES          SERINE 4                POLAR NONIONIZABLE (POL) 8
TCC    YES
TCA    YES
TCG    YES
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
ACT    YES          THREONINE 4
ACC    YES
ACA    YES
ACG    YES
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0
TABLE 54. Mutagenic Cassette: N, G, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
GGT    YES          GLYCINE 4               NONPOLAR (NPL) 5
GGC    YES
GGA    YES
GGG    YES
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
TGG    YES          TRYPTOPHAN 1
                    PROLINE 0
AGT    YES          SERINE 2                POLAR NONIONIZABLE (POL) 4
AGC    YES
TGT    YES          CYSTEINE 2
TGC    YES
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 6
CGT    YES          ARGININE 6
CGC    YES
CGA    YES
CGG    YES
AGA    YES
AGG    YES
                    HISTIDINE 0
TGA    YES          STOP CODON 1            STOP SIGNAL (STP) 1
TABLE 55. Mutagenic Cassette: N, T, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 16
                    ALANINE 0
GTT    YES          VALINE 4
GTC    YES
GTA    YES
GTG    YES
TTA    YES          LEUCINE 6
TTG    YES
CTT    YES
CTC    YES
CTA    YES
CTG    YES
ATT    YES          ISOLEUCINE 3
ATC    YES
ATA    YES
ATG    YES          METHIONINE 1
TTT    YES          PHENYLALANINE 2
TTC    YES
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 0
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0
TABLE 56. Mutagenic Cassette: N, A/C, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 8
GCT    YES          ALANINE 4
GCC    YES
GCA    YES
GCG    YES
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
CCT    YES          PROLINE 4
CCC    YES
CCA    YES
CCG    YES
TCT    YES          SERINE 4                POLAR NONIONIZABLE (POL) 14
TCC    YES
TCA    YES
TCG    YES
                    CYSTEINE 0
AAT    YES          ASPARAGINE 2
AAC    YES
CAA    YES          GLUTAMINE 2
CAG    YES
TAT    YES          TYROSINE 2
TAC    YES
ACT    YES          THREONINE 4
ACC    YES
ACA    YES
ACG    YES
GAT    YES          ASPARTIC ACID 2         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 4
GAC    YES
GAA    YES          GLUTAMIC ACID 2
GAG    YES
AAA    YES          LYSINE 2                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 4
AAG    YES
                    ARGININE 0
CAT    YES          HISTIDINE 2
CAC    YES
TAA    YES          STOP CODON 2            STOP SIGNAL (STP) 2
TAG    YES
TABLE 57. Mutagenic Cassette: N, A/G, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
GGT    YES          GLYCINE 4               NONPOLAR (NPL) 5
GGC    YES
GGA    YES
GGG    YES
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
TGG    YES          TRYPTOPHAN 1
                    PROLINE 0
AGT    YES          SERINE 2                POLAR NONIONIZABLE (POL) 10
AGC    YES
TGT    YES          CYSTEINE 2
TGC    YES
AAT    YES          ASPARAGINE 2
AAC    YES
CAA    YES          GLUTAMINE 2
CAG    YES
TAT    YES          TYROSINE 2
TAC    YES
                    THREONINE 0
GAT    YES          ASPARTIC ACID 2         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 4
GAC    YES
GAA    YES          GLUTAMIC ACID 2
GAG    YES
AAA    YES          LYSINE 2                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 10
AAG    YES
CGT    YES          ARGININE 6
CGC    YES
CGA    YES
CGG    YES
AGA    YES
AGG    YES
CAT    YES          HISTIDINE 2
CAC    YES
TAA    YES          STOP CODON 3            STOP SIGNAL (STP) 3
TAG    YES
TGA    YES
TABLE 58. Mutagenic Cassette: N, A/T, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 16
                    ALANINE 0
GTT    YES          VALINE 4
GTC    YES
GTA    YES
GTG    YES
TTA    YES          LEUCINE 6
TTG    YES
CTT    YES
CTC    YES
CTA    YES
CTG    YES
ATT    YES          ISOLEUCINE 3
ATC    YES
ATA    YES
ATG    YES          METHIONINE 1
TTT    YES          PHENYLALANINE 2
TTC    YES
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 6
                    CYSTEINE 0
AAT    YES          ASPARAGINE 2
AAC    YES
CAA    YES          GLUTAMINE 2
CAG    YES
TAT    YES          TYROSINE 2
TAC    YES
                    THREONINE 0
GAT    YES          ASPARTIC ACID 2         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 4
GAC    YES
GAA    YES          GLUTAMIC ACID 2
GAG    YES
AAA    YES          LYSINE 2                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 4
AAG    YES
                    ARGININE 0
CAT    YES          HISTIDINE 2
CAC    YES
TAA    YES          STOP CODON 2            STOP SIGNAL (STP) 2
TAG    YES
TABLE 59. Mutagenic Cassette: N, C/G, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
GGT    YES          GLYCINE 4               NONPOLAR (NPL) 13
GGC    YES
GGA    YES
GGG    YES
GCT    YES          ALANINE 4
GCC    YES
GCA    YES
GCG    YES
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
TGG    YES          TRYPTOPHAN 1
CCT    YES          PROLINE 4
CCC    YES
CCA    YES
CCG    YES
TCT    YES          SERINE 6                POLAR NONIONIZABLE (POL) 12
TCC    YES
TCA    YES
TCG    YES
AGT    YES
AGC    YES
TGT    YES          CYSTEINE 2
TGC    YES
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
ACT    YES          THREONINE 4
ACC    YES
ACA    YES
ACG    YES
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 6
CGT    YES          ARGININE 6
CGC    YES
CGA    YES
CGG    YES
AGA    YES
AGG    YES
                    HISTIDINE 0
TGA    YES          STOP CODON 1            STOP SIGNAL (STP) 1
TABLE 60. Mutagenic Cassette: N, C/T, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 24
GCT    YES          ALANINE 4
GCC    YES
GCA    YES
GCG    YES
GTT    YES          VALINE 4
GTC    YES
GTA    YES
GTG    YES
TTA    YES          LEUCINE 6
TTG    YES
CTT    YES
CTC    YES
CTA    YES
CTG    YES
ATT    YES          ISOLEUCINE 3
ATC    YES
ATA    YES
ATG    YES          METHIONINE 1
TTT    YES          PHENYLALANINE 2
TTC    YES
                    TRYPTOPHAN 0
CCT    YES          PROLINE 4
CCC    YES
CCA    YES
CCG    YES
TCT    YES          SERINE 4                POLAR NONIONIZABLE (POL) 8
TCC    YES
TCA    YES
TCG    YES
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
ACT    YES          THREONINE 4
ACC    YES
ACA    YES
ACG    YES
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0
TABLE 61. Mutagenic Cassette: N, G/T, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
GGT    YES          GLYCINE 4               NONPOLAR (NPL) 21
GGC    YES
GGA    YES
GGG    YES
                    ALANINE 0
GTT    YES          VALINE 4
GTC    YES
GTA    YES
GTG    YES
TTA    YES          LEUCINE 6
TTG    YES
CTT    YES
CTC    YES
CTA    YES
CTG    YES
ATT    YES          ISOLEUCINE 3
ATC    YES
ATA    YES
ATG    YES          METHIONINE 1
TTT    YES          PHENYLALANINE 2
TTC    YES
TGG    YES          TRYPTOPHAN 1
                    PROLINE 0
AGT    YES          SERINE 2                POLAR NONIONIZABLE (POL) 4
AGC    YES
TGT    YES          CYSTEINE 2
TGC    YES
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 6
CGT    YES          ARGININE 6
CGC    YES
CGA    YES
CGG    YES
AGA    YES
AGG    YES
                    HISTIDINE 0
TGA    YES          STOP CODON 1            STOP SIGNAL (STP) 1
TABLE 62. Mutagenic Cassette: N, A/C/G, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
GGT    YES          GLYCINE 4               NONPOLAR (NPL) 13
GGC    YES
GGA    YES
GGG    YES
GCT    YES          ALANINE 4
GCC    YES
GCA    YES
GCG    YES
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
TGG    YES          TRYPTOPHAN 1
CCT    YES          PROLINE 4
CCC    YES
CCA    YES
CCG    YES
TCT    YES          SERINE 6                POLAR NONIONIZABLE (POL) 18
TCC    YES
TCA    YES
TCG    YES
AGT    YES
AGC    YES
TGT    YES          CYSTEINE 2
TGC    YES
AAT    YES          ASPARAGINE 2
AAC    YES
CAA    YES          GLUTAMINE 2
CAG    YES
TAT    YES          TYROSINE 2
TAC    YES
ACT    YES          THREONINE 4
ACC    YES
ACA    YES
ACG    YES
GAT    YES          ASPARTIC ACID 2         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 4
GAC    YES
GAA    YES          GLUTAMIC ACID 2
GAG    YES
AAA    YES          LYSINE 2                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 10
AAG    YES
CGT    YES          ARGININE 6
CGC    YES
CGA    YES
CGG    YES
AGA    YES
AGG    YES
CAT    YES          HISTIDINE 2
CAC    YES
TAA    YES          STOP CODON 3            STOP SIGNAL (STP) 3
TAG    YES
TGA    YES

TOTAL  48
TABLE 63. Mutagenic Cassette: N, A/C/T, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 24
GCT    YES          ALANINE 4
GCC    YES
GCA    YES
GCG    YES
GTT    YES          VALINE 4
GTC    YES
GTA    YES
GTG    YES
TTA    YES          LEUCINE 6
TTG    YES
CTT    YES
CTC    YES
CTA    YES
CTG    YES
ATT    YES          ISOLEUCINE 3
ATC    YES
ATA    YES
ATG    YES          METHIONINE 1
TTT    YES          PHENYLALANINE 2
TTC    YES
                    TRYPTOPHAN 0
CCT    YES          PROLINE 4
CCC    YES
CCA    YES
CCG    YES
TCT    YES          SERINE 4                POLAR NONIONIZABLE (POL) 14
TCC    YES
TCA    YES
TCG    YES
                    CYSTEINE 0
AAT    YES          ASPARAGINE 2
AAC    YES
CAA    YES          GLUTAMINE 2
CAG    YES
TAT    YES          TYROSINE 2
TAC    YES
ACT    YES          THREONINE 4
ACC    YES
ACA    YES
ACG    YES
GAT    YES          ASPARTIC ACID 2         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 4
GAC    YES
GAA    YES          GLUTAMIC ACID 2
GAG    YES
AAA    YES          LYSINE 2                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 4
AAG    YES
                    ARGININE 0
CAT    YES          HISTIDINE 2
CAC    YES
TAA    YES          STOP CODON 2            STOP SIGNAL (STP) 2
TAG    YES
TABLE 64. Mutagenic Cassette: N, A/G/T, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
GGT    YES          GLYCINE 4               NONPOLAR (NPL) 21
GGC    YES
GGA    YES
GGG    YES
                    ALANINE 0
GTT    YES          VALINE 4
GTC    YES
GTA    YES
GTG    YES
TTA    YES          LEUCINE 6
TTG    YES
CTT    YES
CTC    YES
CTA    YES
CTG    YES
ATT    YES          ISOLEUCINE 3
ATC    YES
ATA    YES
ATG    YES          METHIONINE 1
TTT    YES          PHENYLALANINE 2
TTC    YES
TGG    YES          TRYPTOPHAN 1
                    PROLINE 0
AGT    YES          SERINE 2                POLAR NONIONIZABLE (POL) 10
AGC    YES
TGT    YES          CYSTEINE 2
TGC    YES
AAT    YES          ASPARAGINE 2
AAC    YES
CAA    YES          GLUTAMINE 2
CAG    YES
TAT    YES          TYROSINE 2
TAC    YES
                    THREONINE 0
GAT    YES          ASPARTIC ACID 2         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 4
GAC    YES
GAA    YES          GLUTAMIC ACID 2
GAG    YES
AAA    YES          LYSINE 2                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 10
AAG    YES
CGT    YES          ARGININE 6
CGC    YES
CGA    YES
CGG    YES
AGA    YES
AGG    YES
CAT    YES          HISTIDINE 2
CAC    YES
TAA    YES          STOP CODON 3            STOP SIGNAL (STP) 3
TAG    YES
TGA    YES
TABLE 65. Mutagenic Cassette: N, C/G/T, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
GGT    YES          GLYCINE 4               NONPOLAR (NPL) 29
GGC    YES
GGA    YES
GGG    YES
GCT    YES          ALANINE 4
GCC    YES
GCA    YES
GCG    YES
GTT    YES          VALINE 4
GTC    YES
GTA    YES
GTG    YES
TTA    YES          LEUCINE 6
TTG    YES
CTT    YES
CTC    YES
CTA    YES
CTG    YES
ATT    YES          ISOLEUCINE 3
ATC    YES
ATA    YES
ATG    YES          METHIONINE 1
TTT    YES          PHENYLALANINE 2
TTC    YES
TGG    YES          TRYPTOPHAN 1
CCT    YES          PROLINE 4
CCC    YES
CCA    YES
CCG    YES
TCT    YES          SERINE 6                POLAR NONIONIZABLE (POL) 12
TCC    YES
TCA    YES
TCG    YES
AGT    YES
AGC    YES
TGT    YES          CYSTEINE 2
TGC    YES
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
ACT    YES          THREONINE 4
ACC    YES
ACA    YES
ACG    YES
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 6
CGT    YES          ARGININE 6
CGC    YES
CGA    YES
CGG    YES
AGA    YES
AGG    YES
                    HISTIDINE 0
TGA    YES          STOP CODON 1            STOP SIGNAL (STP) 1

TOTAL  48
13 Amino Acids Are Represented
NPL: POL: NEG: POS: STP = 29: 12: 0: 6: 1
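
The cassettes that vary only the middle position (N, X, N with one, two, or three bases allowed at X) can be compared side by side by parsing the notation used in the table titles. The sketch below is illustrative only: `parse_cassette` and `category_ratio` are hypothetical helper names, and the category grouping again follows the tables.

```python
from collections import Counter
from itertools import combinations, product

# Standard genetic code in DNA letters; codons are ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

GROUPS = {"NPL": "GAVLIMFWP", "POL": "SCNQYT", "NEG": "DE", "POS": "KRH", "STP": "*"}
CATEGORY = {aa: cat for cat, members in GROUPS.items() for aa in members}


def parse_cassette(spec):
    """Turn a title-style spec such as "N, C/G/T, N" into one base string per position."""
    positions = []
    for field in spec.split(","):
        field = field.strip().upper()
        positions.append(BASES if field == "N" else field.replace("/", ""))
    return positions


def category_ratio(spec):
    """Return codon counts in the order NPL, POL, NEG, POS, STP for the given cassette."""
    codons = ("".join(c) for c in product(*parse_cassette(spec)))
    counts = Counter(CATEGORY[CODON_TABLE[c]] for c in codons)
    return tuple(counts[cat] for cat in GROUPS)


# List every middle-position choice of one, two, or three bases.
for size in (1, 2, 3):
    for middle in combinations("ACGT", size):
        spec = "N, " + "/".join(middle) + ", N"
        print(spec, category_ratio(spec))
```

Calling `category_ratio("N, C/G/T, N")` returns the 29: 12: 0: 6: 1 ratio given in the summary above, which is one way to cross-check a reconstructed table against the genetic code.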
TABLE 66. Mutagenic Cassette: C, C, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 4
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
CCT    YES          PROLINE 4
CCC    YES
CCA    YES
CCG    YES
                    SERINE 0                POLAR NONIONIZABLE (POL) 0
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

1 Amino Acid Is Represented
NPL: POL: NEG: POS: STP = 4: 0: 0: 0: 0


TABLE 67. Mutagenic Cassette: G, G, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
GGT    YES          GLYCINE 4               NONPOLAR (NPL) 4
GGC    YES
GGA    YES
GGG    YES
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 0
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
1 Amino Acid Is Represented
NPL: POL: NEG: POS: STP = 4: 0: 0: 0: 0
TABLE 68. Mutagenic Cassette: G, C, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 4
GCT    YES          ALANINE 4
GCC    YES
GCA    YES
GCG    YES
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 0
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
1 Amino Acid Is Represented
NPL: POL: NEG: POS: STP = 4: 0: 0: 0: 0



TABLE 69. Mutagenic Cassette: G, T, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 4
                    ALANINE 0
GTT    YES          VALINE 4
GTC    YES
GTA    YES
GTG    YES
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 0
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
1 Amino Acid Is Represented
NPL: POL: NEG: POS: STP = 4: 0: 0: 0: 0
TABLE 70. Mutagenic Cassette: C, G, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 0
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 0
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 4
CGT    YES          ARGININE 4
CGC    YES
CGA    YES
CGG    YES
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
1 Amino Acid Is Represented
NPL: POL: NEG: POS: STP = 0: 0: 0: 4: 0



TABLE 71. Mutagenic Cassette: C, T, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 4
                    ALANINE 0
                    VALINE 0
CTT    YES          LEUCINE 4
CTC    YES
CTA    YES
CTG    YES
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 0
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
1 Amino Acid Is Represented
NPL: POL: NEG: POS: STP = 4: 0: 0: 0: 0
TABLE 72. Mutagenic Cassette: T, C, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 0
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
TCT    YES          SERINE 4                POLAR NONIONIZABLE (POL) 4
TCC    YES
TCA    YES
TCG    YES
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
1 Amino Acid Is Represented
NPL: POL: NEG: POS: STP = 0: 4: 0: 0: 0


TABLE 73. Mutagenic Cassette: A, C, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 0
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 4
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
ACT    YES          THREONINE 4
ACC    YES
ACA    YES
ACG    YES
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
1 Amino Acid Is Represented
NPL: POL: NEG: POS: STP = 0: 4: 0: 0: 0
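
The cassettes above that fix the first two bases and leave the third as N each yield a single amino acid. As a cross-check, the sketch below searches the standard genetic code for every two-base prefix with that property. It is an illustration rather than part of the original text; the function name `single_amino_acid_prefixes` and the single-letter amino acid codes are assumptions introduced here.

```python
from itertools import product

# Standard genetic code in DNA letters; codons are ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_amino_acid_prefixes():
    """Map each two-base prefix XY to its amino acid when all four XYN codons agree."""
    found = {}
    for first, second in product(BASES, repeat=2):
        encoded = {CODON_TABLE[first + second + third] for third in BASES}
        if len(encoded) == 1:  # all four third-base choices give the same residue
            found[first + second] = encoded.pop()
    return found


for prefix, amino_acid in single_amino_acid_prefixes().items():
    print(f"{prefix[0]}, {prefix[1]}, N -> {amino_acid}")
```

The search finds eight such prefixes (TC, CT, CC, CG, AC, GT, GC, GG), encoding serine, leucine, proline, arginine, threonine, valine, alanine, and glycine respectively.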
TABLE 74. Mutagenic Cassette: G, A, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 0
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 0
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
GAT    YES          ASPARTIC ACID 2         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 4
GAC    YES
GAA    YES          GLUTAMIC ACID 2
GAG    YES
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
2 Amino Acids Are Represented
NPL: POL: NEG: POS: STP = 0: 0: 4: 0: 0



TABLE 75. Mutagenic Cassette: A, T, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 4
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
ATT    YES          ISOLEUCINE 3
ATC    YES
ATA    YES
ATG    YES          METHIONINE 1
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 0
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
2 Amino Acids Are Represented
NPL: POL: NEG: POS: STP = 4: 0: 0: 0: 0
TABLE 76. Mutagenic Cassette: C, A, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 0
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 2
                    CYSTEINE 0
                    ASPARAGINE 0
CAA    YES          GLUTAMINE 2
CAG    YES
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 2
                    ARGININE 0
CAT    YES          HISTIDINE 2
CAC    YES
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
2 Amino Acids Are Represented
NPL: POL: NEG: POS: STP = 0: 2: 0: 2: 0



TABLE 77. Mutagenic Cassette: T, T, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 4
                    ALANINE 0
                    VALINE 0
TTA    YES          LEUCINE 2
TTG    YES
                    ISOLEUCINE 0
                    METHIONINE 0
TTT    YES          PHENYLALANINE 2
TTC    YES
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 0
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
2 Amino Acids Are Represented
NPL: POL: NEG: POS: STP = 4: 0: 0: 0: 0
TABLE 78. Mutagenic Cassette: A, A, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 0
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 2
                    CYSTEINE 0
AAT    YES          ASPARAGINE 2
AAC    YES
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
AAA    YES          LYSINE 2                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 2
AAG    YES
                    ARGININE 0
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
2 Amino Acids Are Represented
NPL: POL: NEG: POS: STP = 0: 2: 0: 2: 0



TABLE 79. Mutagenic Cassette: T, A, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 0
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 2
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
TAT    YES          TYROSINE 2
TAC    YES
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
TAA    YES          STOP CODON 2            STOP SIGNAL (STP) 2
TAG    YES

TOTAL  4
1 Amino Acid Is Represented
NPL: POL: NEG: POS: STP = 0: 2: 0: 0: 2
TABLE 80. Mutagenic Cassette: T, G, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 1
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
TGG    YES          TRYPTOPHAN 1
                    PROLINE 0
                    SERINE 0                POLAR NONIONIZABLE (POL) 2
TGT    YES          CYSTEINE 2
TGC    YES
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0
                    ARGININE 0
                    HISTIDINE 0
TGA    YES          STOP CODON 1            STOP SIGNAL (STP) 1

TOTAL  4
2 Amino Acids Are Represented
NPL: POL: NEG: POS: STP = 1: 2: 0: 0: 1


TABLE 81. Mutagenic Cassette: A, G, N

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency)
                    GLYCINE 0               NONPOLAR (NPL) 0
                    ALANINE 0
                    VALINE 0
                    LEUCINE 0
                    ISOLEUCINE 0
                    METHIONINE 0
                    PHENYLALANINE 0
                    TRYPTOPHAN 0
                    PROLINE 0
AGT    YES          SERINE 2                POLAR NONIONIZABLE (POL) 2
AGC    YES
                    CYSTEINE 0
                    ASPARAGINE 0
                    GLUTAMINE 0
                    TYROSINE 0
                    THREONINE 0
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0
                    GLUTAMIC ACID 0
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 2
AGA    YES          ARGININE 2
AGG    YES
                    HISTIDINE 0
                    STOP CODON 0            STOP SIGNAL (STP) 0

TOTAL  4
2 Amino Acids Are Represented
NPL: POL: NEG: POS: STP = 0: 2: 0: 2: 0
TABLE 82. Mutagenic Cassette: G/C, G, N 

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency) 

GGT    YES          GLYCINE 4               NONPOLAR (NPL) 4 
GGC    YES 
GGA    YES 
GGG    YES 
                    ALANINE 0 
                    VALINE 0 
                    LEUCINE 0 
                    ISOLEUCINE 0 
                    METHIONINE 0 
                    PHENYLALANINE 0 
                    TRYPTOPHAN 0 
                    PROLINE 0 
                    SERINE 0                POLAR NONIONIZABLE (POL) 0 
                    CYSTEINE 0 
                    ASPARAGINE 0 
                    GLUTAMINE 0 
                    TYROSINE 0 
                    THREONINE 0 
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0 
                    GLUTAMIC ACID 0 
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 4 
CGT    YES          ARGININE 4 
CGC    YES 
CGA    YES 
CGG    YES 
                    HISTIDINE 0 
                    STOP CODON 0            STOP SIGNAL (STP) 0 

TOTAL  8 

2 Amino Acids Are Represented 

NPL: POL: NEG: POS: STP = 
4: 0: 0: 4: 0 



CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency) 

                    GLYCINE 0               NONPOLAR (NPL) 8 
GCT    YES          ALANINE 4 
GCC    YES 
GCA    YES 
GCG    YES 
                    VALINE 0 
                    LEUCINE 0 
                    ISOLEUCINE 0 
                    METHIONINE 0 
                    PHENYLALANINE 0 
                    TRYPTOPHAN 0 
CCT    YES          PROLINE 4 
CCC    YES 
CCA    YES 
CCG    YES 
                    SERINE 0                POLAR NONIONIZABLE (POL) 0 
                    CYSTEINE 0 
                    ASPARAGINE 0 
                    GLUTAMINE 0 
                    TYROSINE 0 
                    THREONINE 0 
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0 
                    GLUTAMIC ACID 0 
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0 
                    ARGININE 0 
                    HISTIDINE 0 
                    STOP CODON 0            STOP SIGNAL (STP) 0 

TOTAL  8 

2 Amino Acids Are Represented 

NPL: POL: NEG: POS: STP = 
8: 0: 0: 0: 0 
TABLE 84. Mutagenic Cassette: G/C, A, N 

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency) 

                    GLYCINE 0               NONPOLAR (NPL) 0 
                    ALANINE 0 
                    VALINE 0 
                    LEUCINE 0 
                    ISOLEUCINE 0 
                    METHIONINE 0 
                    PHENYLALANINE 0 
                    TRYPTOPHAN 0 
                    PROLINE 0 
                    SERINE 0                POLAR NONIONIZABLE (POL) 2 
                    CYSTEINE 0 
                    ASPARAGINE 0 
CAA    YES          GLUTAMINE 2 
CAG    YES 
                    TYROSINE 0 
                    THREONINE 0 
GAT    YES          ASPARTIC ACID 2         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 4 
GAC    YES 
GAA    YES          GLUTAMIC ACID 2 
GAG    YES 
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 2 
                    ARGININE 0 
CAT    YES          HISTIDINE 2 
CAC    YES 
                    STOP CODON 0            STOP SIGNAL (STP) 0 

TOTAL  8 

4 Amino Acids Are Represented 

NPL: POL: NEG: POS: STP = 
0: 2: 4: 2: 0 
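
By way of illustration only, the codon tallies in tables of this kind can be recomputed from the standard genetic code. The following Python sketch is not part of the disclosed method. The names `cassette_summary`, `CODON_TABLE`, `CATEGORY` and `ALLOWED` are hypothetical, and the category groupings (NPL, POL, NEG, POS, STP) are assumed to follow the groupings used in the tables.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
# '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Category groupings as used in the tables (one-letter amino acid codes).
CATEGORY = {aa: "NPL" for aa in "GAVLIMFWP"}
CATEGORY.update({aa: "POL" for aa in "SCNQYT"})
CATEGORY.update({aa: "NEG" for aa in "DE"})
CATEGORY.update({aa: "POS" for aa in "KRH"})
CATEGORY["*"] = "STP"

# Bases allowed by each cassette symbol; "G/C" and "N" follow the table titles.
ALLOWED = {"A": "A", "C": "C", "G": "G", "T": "T", "G/C": "GC", "N": "ACGT"}


def cassette_summary(first, second, third):
    """Return (codons, amino-acid counts, category counts) for a cassette."""
    codons = ["".join(p)
              for p in product(ALLOWED[first], ALLOWED[second], ALLOWED[third])]
    residues = Counter(CODON_TABLE[c] for c in codons)
    categories = Counter(CATEGORY[CODON_TABLE[c]] for c in codons)
    return codons, residues, categories


codons, residues, categories = cassette_summary("G/C", "A", "N")
print(len(codons), dict(residues), dict(categories))
```

Calling `cassette_summary` with other symbols, for example `("T", "G", "N")`, gives the corresponding tallies for another cassette.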


Mutagenic Cassette: G/C, T, N 

CODON  Represented  AMINO ACID (Frequency)  CATEGORY (Frequency) 

                    GLYCINE 0               NONPOLAR (NPL) 8 
                    ALANINE 0 
GTT    YES          VALINE 4 
GTC    YES 
GTA    YES 
GTG    YES 
CTT    YES          LEUCINE 4 
CTC    YES 
CTA    YES 
CTG    YES 
                    ISOLEUCINE 0 
                    METHIONINE 0 
                    PHENYLALANINE 0 
                    TRYPTOPHAN 0 
                    PROLINE 0 
                    SERINE 0                POLAR NONIONIZABLE (POL) 0 
                    CYSTEINE 0 
                    ASPARAGINE 0 
                    GLUTAMINE 0 
                    TYROSINE 0 
                    THREONINE 0 
                    ASPARTIC ACID 0         IONIZABLE: ACIDIC, NEGATIVE CHARGE (NEG) 0 
                    GLUTAMIC ACID 0 
                    LYSINE 0                IONIZABLE: BASIC, POSITIVE CHARGE (POS) 0 
                    ARGININE 0 
                    HISTIDINE 0 
                    STOP CODON 0            STOP SIGNAL (STP) 0 

TOTAL  8 

2 Amino Acids Are Represented 

NPL: POL: NEG: POS: STP = 
8: 0: 0: 0: 0 
2.11.2. CHIMERIZATIONS 

2.11.2.1 "SHUFFLING" 

Nucleic acid shuffling is a method for in vitro or in vivo homologous 
recombination of pools of shorter or smaller polynucleotides to produce a polynucleotide 
or polynucleotides. Mixtures of related nucleic acid sequences or polynucleotides are 
subjected to sexual PCR to provide random polynucleotides, and reassembled to yield a 
library or mixed population of recombinant hybrid nucleic acid molecules or 
polynucleotides. 

In contrast to cassette mutagenesis, only shuffling and error-prone PCR allow one 
to mutate a pool of sequences blindly (without sequence information other than primers). 

The advantage of the mutagenic shuffling of this invention over error-prone PCR 
alone for repeated selection can best be explained with an example from antibody 
engineering. Consider DNA shuffling as compared with error-prone PCR (not sexual 
PCR). The initial library of selected pooled sequences can consist of related sequences of 
diverse origin (i.e. antibodies from naive mRNA) or can be derived by any type of 
mutagenesis (including shuffling) of a single antibody gene. A collection of selected 
complementarity determining regions ("CDRs") is obtained after the first round of 
affinity selection. In the diagram the thick CDRs confer onto the antibody molecule 
increased affinity for the antigen. Shuffling allows the free combinatorial association of 
all of the CDR1s with all of the CDR2s with all of the CDR3s, for example. 

This method differs from error-prone PCR, in that it is an inverse chain reaction. 
In error-prone PCR, the number of polymerase start sites and the number of molecules 
grow exponentially. However, the sequence of the polymerase start sites and the 
sequence of the molecules remain essentially the same. In contrast, in nucleic acid 
reassembly or shuffling of random polynucleotides the number of start sites and the 
number (but not size) of the random polynucleotides decrease over time. For 
polynucleotides derived from whole plasmids the theoretical endpoint is a single, large 
concatemeric molecule. 
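
The "inverse chain reaction" behaviour described above can be pictured with a toy in silico model. The sketch below is an illustration under simplifying assumptions, not the wet-lab procedure: fragments are single-stranded text strings, annealing is modelled as an exact suffix/prefix match of at least `min_len` bases, and each fragment pairs with at most one partner per cycle. The function names `overlap` and `reassembly_cycle` are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (>= min_len), else 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def reassembly_cycle(fragments, min_len):
    """One toy cycle: each fragment anneals to at most one partner and is extended."""
    pool = list(fragments)
    merged = []
    while pool:
        a = pool.pop(0)
        for i, b in enumerate(pool):
            n = overlap(a, b, min_len)
            if n:  # a lies upstream of b
                merged.append(a + b[n:])
                pool.pop(i)
                break
            n = overlap(b, a, min_len)
            if n:  # b lies upstream of a
                merged.append(b + a[n:])
                pool.pop(i)
                break
        else:
            merged.append(a)  # no partner found in this cycle
    return merged


template = "AAAACCCCGGGGTTTTACGT"
# Overlapping 8-base fragments taken every 4 bases along the template.
fragments = [template[i:i + 8] for i in range(0, len(template) - 4, 4)]

pool = fragments
counts = [len(pool)]
while len(pool) > 1:
    new_pool = reassembly_cycle(pool, min_len=4)
    if len(new_pool) == len(pool):
        break  # nothing left that can anneal
    pool = new_pool
    counts.append(len(pool))

print(counts)  # molecule count per cycle
print(pool)    # reassembled molecule(s)
```

In this toy run the number of molecules falls from cycle to cycle while their length grows, which is the opposite of the exponential increase in copy number seen in conventional PCR.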

Since cross-overs occur at regions of homology, recombination will primarily 
occur between members of the same sequence family. This discourages combinations of 
CDRs that are grossly incompatible (e.g., directed against different epitopes of the same 
antigen). It is contemplated that multiple families of sequences can be shuffled in the 
same reaction. Further, shuffling generally conserves the relative order, such that, for 
example, CDR1 will not be found in the position of CDR2. 

Rare shufflants will contain a large number of the best (e.g., highest affinity) CDRs 
and these rare shufflants may be selected based on their superior affinity. 

CDRs from a pool of 100 different selected antibody sequences can be permutated 
in up to 100^6 different ways. This large number of permutations cannot be represented in 
a single library of DNA sequences. Accordingly, it is contemplated that multiple cycles 
of DNA shuffling and selection may be required depending on the length of the sequence 
and the sequence diversity desired. 
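
The size of this combinatorial space can be checked with a short calculation. The Python sketch below assumes six CDR positions per antibody (three in each chain), which is what the sixth power in the figure above implies. It models a shufflant simply as one CDR drawn independently at each position from any parent, so positional order is preserved. The names `count_combinations` and `random_shufflant` are hypothetical.

```python
import random


def count_combinations(pool_size, cdr_positions=6):
    """Number of ways to pick one CDR per position from pool_size parents."""
    return pool_size ** cdr_positions


def random_shufflant(parents, rng):
    """Pick, for each CDR position, the CDR of a randomly chosen parent.

    Position i of the result always comes from position i of some parent,
    so CDR1 never ends up in the position of CDR2.
    """
    positions = len(parents[0])
    return [rng.choice(parents)[i] for i in range(positions)]


# Three illustrative parents, each with six labelled CDRs.
parents = [[f"Ab{n}-CDR{i + 1}" for i in range(6)] for n in range(1, 4)]
rng = random.Random(0)
shufflant = random_shufflant(parents, rng)

print(count_combinations(100))  # 100 to the sixth power
print(shufflant)
```

With 100 parents the count is 10^12, far more than a single library can hold, which is why repeated cycles of shuffling and selection are contemplated.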

Error-prone PCR, in contrast, keeps all the selected CDRs in the same relative 
sequence, generating a much smaller mutant cloud. 

The template polynucleotide which may be used in the methods of this invention 
may be DNA or RNA. It may be of various lengths depending on the size of the gene or 
shorter or smaller polynucleotide to be recombined or reassembled. Preferably, the 
template polynucleotide is from 50 bp to 50 kb. It is contemplated that entire vectors 
containing the nucleic acid encoding the protein of interest can be used in the methods of 
this invention, and in fact have been successfully used. 
The template polynucleotide may be obtained by amplification using the PCR 
reaction (USPN 4,683,202 and USPN 4,683,195) or other amplification or cloning 
methods. However, the removal of free primers from the PCR products before subjecting 
them to pooling of the PCR products and sexual PCR may provide more efficient results. 
Failure to adequately remove the primers from the original pool before sexual PCR can 
lead to a low frequency of crossover clones. 

The template polynucleotide often should be double-stranded. A double-stranded 
nucleic acid molecule is recommended to ensure that regions of the resulting 
single-stranded polynucleotides are complementary to each other and thus can hybridize 
to form a double-stranded molecule. 

It is contemplated that single-stranded or double-stranded nucleic acid 
polynucleotides having regions of identity to the template polynucleotide and regions of 
heterology to the template polynucleotide may be added to the template polynucleotide 
at this step. It is also contemplated that two different but related polynucleotide templates 
can be mixed at this step. 

The double-stranded polynucleotide template and any added double- or 
single-stranded polynucleotides are subjected to sexual PCR, which includes slowing or 
halting, to provide a mixture of polynucleotides of from about 5 bp to 5 kb or more. 
Preferably the size of the random polynucleotides is from about 10 bp to 1000 bp, more 
preferably the size of the polynucleotides is from about 20 bp to 500 bp. 

Alternatively, it is also contemplated that double-stranded nucleic acid having 
multiple nicks may be used in the methods of this invention. A nick is a break in one 
strand of the double-stranded nucleic acid. The distance between such nicks is preferably 
5 bp to 5 kb, more preferably between 10 bp to 1000 bp. This can provide areas of 
self-priming to produce shorter or smaller polynucleotides to be included with the 
polynucleotides resulting from random primers, for example. 

The concentration of any one specific polynucleotide will not be greater than 1% 
by weight of the total polynucleotides, more preferably the concentration of any one 
specific nucleic acid sequence will not be greater than 0.1% by weight of the total nucleic 
acid. 

The number of different specific polynucleotides in the mixture will be at least 
about 100, preferably at least about 500, and more preferably at least about 1000. 
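
The two composition limits just stated (weight fraction of any one species, and number of distinct species) lend themselves to a simple check. The sketch below is illustrative only; it assumes the mass of each distinct polynucleotide is known, which in practice it usually is not, and the function name `check_pool` and its input format are hypothetical.

```python
def check_pool(masses):
    """Compare a pool against the stated composition limits.

    masses maps each distinct polynucleotide to its mass (any consistent unit).
    """
    total = sum(masses.values())
    max_fraction = max(masses.values()) / total
    return {
        "species": len(masses),
        "max_weight_fraction": max_fraction,
        "at_most_1_percent": max_fraction <= 0.01,
        "at_most_0_1_percent": max_fraction <= 0.001,
        "at_least_100_species": len(masses) >= 100,
    }


# An evenly distributed pool of 2000 species passes every limit.
even_pool = {f"frag{i}": 1.0 for i in range(2000)}
print(check_pool(even_pool))

# A pool dominated by two species fails them.
skewed_pool = {"fragA": 50.0, "fragB": 50.0}
print(check_pool(skewed_pool))
```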

At this step single-stranded or double-stranded polynucleotides, either synthetic or 
natural, may be added to the random double-stranded shorter or smaller polynucleotides 
in order to increase the heterogeneity of the mixture of polynucleotides. 


It is also contemplated that populations of double-stranded randomly broken 
polynucleotides may be mixed or combined at this step with the polynucleotides from the 
sexual PCR process and optionally subjected to one or more additional sexual PCR 
cycles. 

Where insertion of mutations into the template polynucleotide is desired, 
single-stranded or double-stranded polynucleotides having a region of identity to the 
template polynucleotide and a region of heterology to the template polynucleotide may be 
added in a 20 fold excess by weight as compared to the total nucleic acid, more 
preferably the single-stranded polynucleotides may be added in a 10 fold excess by 
weight as compared to the total nucleic acid. 

Where a mixture of different but related template polynucleotides is desired, 
populations of polynucleotides from each of the templates may be combined at a ratio of 
less than about 1:100, more preferably the ratio is less than about 1:40. For example, a 
backcross of the wild-type polynucleotide with a population of mutated polynucleotide 
may be desired to eliminate neutral mutations (e.g., mutations yielding an insubstantial 
alteration in the phenotypic property being selected for). In such an example, the ratio of 
randomly provided wild-type polynucleotides which may be added to the randomly 
provided sexual PCR cycle hybrid polynucleotides is approximately 1:1 to about 100:1, 
and more preferably from 1:1 to 40:1. 

The mixed population of random polynucleotides is denatured to form 
single-stranded polynucleotides and then re-annealed. Only those single-stranded 
polynucleotides having regions of homology with other single-stranded polynucleotides 
will re-anneal. 

The random polynucleotides may be denatured by heating. One skilled in the art 
could determine the conditions necessary to completely denature the double-stranded 
nucleic acid. Preferably the temperature is from 80 °C to 100 °C, more preferably the 
temperature is from 90 °C to 96 °C. Other methods which may be used to denature the 
polynucleotides include pressure (36) and pH. 

The polynucleotides may be re-annealed by cooling. Preferably the temperature 
is from 20 °C to 75 °C, more preferably the temperature is from 40 °C to 65 °C. If a high 
frequency of crossovers is needed based on an average of only 4 consecutive bases of 
homology, recombination can be forced by using a low annealing temperature, although 
the process becomes more difficult. The degree of renaturation which occurs will depend 
on the degree of homology between the population of single-stranded polynucleotides. 

Renaturation can be accelerated by the addition of polyethylene glycol ("PEG") or 
salt. The salt concentration is preferably from 0 mM to 200 mM, more preferably the salt 
concentration is from 10 mM to 100 mM. The salt may be KCl or NaCl. The 
concentration of PEG is preferably from 0% to 20%, more preferably from 5% to 10%. 

The annealed polynucleotides are next incubated in the presence of a nucleic acid 
polymerase and dNTPs (i.e. dATP, dCTP, dGTP and dTTP). The nucleic acid 
polymerase may be the Klenow fragment, the Taq polymerase or any other DNA 
polymerase known in the art. 

The approach to be used for the assembly depends on the minimum degree of 
homology that should still yield crossovers. If the areas of identity are large, Taq 
polymerase can be used with an annealing temperature of between 45-65 °C. If the areas 
of identity are small, Klenow polymerase can be used with an annealing temperature of 
between 20-30 °C. One skilled in the art could vary the temperature of annealing to 
increase the number of cross-overs achieved. 

The polymerase may be added to the random polynucleotides prior to annealing, 
simultaneously with annealing or after annealing. 

The cycle of denaturation, renaturation and incubation in the presence of 
polymerase is referred to herein as shuffling or reassembly of the nucleic acid. This cycle 
is repeated for a desired number of times. Preferably the cycle is repeated from 2 to 50 
times, more preferably the cycle is repeated from 10 to 40 times. 
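
For planning purposes, the "more preferably" ranges quoted above for the denaturation, annealing and extension cycle can be collected in one place and a proposed protocol compared against them. The sketch below is a bookkeeping aid only, not a recommended protocol; the class `ShufflingCycle`, the table `PREFERRED` and the function `out_of_range` are hypothetical names, and the numeric bounds are copied from the text.

```python
from dataclasses import dataclass

# (low, high) bounds of the "more preferably" ranges quoted in the text.
PREFERRED = {
    "denature_c": (90.0, 96.0),   # denaturation temperature, deg C
    "anneal_c": (40.0, 65.0),     # re-annealing temperature, deg C
    "salt_mM": (10.0, 100.0),     # KCl or NaCl concentration, mM
    "peg_percent": (5.0, 10.0),   # PEG concentration, percent
    "cycles": (10, 40),           # number of shuffling cycles
}


@dataclass
class ShufflingCycle:
    denature_c: float
    anneal_c: float
    salt_mM: float
    peg_percent: float
    cycles: int


def out_of_range(settings):
    """Return the names of settings outside the more-preferred ranges."""
    return [
        name
        for name, (low, high) in PREFERRED.items()
        if not low <= getattr(settings, name) <= high
    ]


taq_like = ShufflingCycle(94.0, 55.0, 50.0, 5.0, 25)
klenow_like = ShufflingCycle(94.0, 25.0, 50.0, 5.0, 25)
print(out_of_range(taq_like))
print(out_of_range(klenow_like))
```

A setting flagged here is not necessarily wrong: the 20-30 °C annealing suggested for Klenow polymerase with small areas of identity falls outside the more-preferred annealing range but inside the broader 20 °C to 75 °C range.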

The resulting nucleic acid is a larger double-stranded polynucleotide of from 
about 50 bp to about 100 kb, preferably the larger polynucleotide is from 500 bp to 50 kb. 

This larger polynucleotide may contain a number of copies of a polynucleotide 
having the same size as the template polynucleotide in tandem. This concatemeric 
polynucleotide is then denatured into single copies of the template polynucleotide. The 
result will be a population of polynucleotides of approximately the same size as the 
template polynucleotide. The population will be a mixed population where single or 
double-stranded polynucleotides having an area of identity and an area of heterology 
have been added to the template polynucleotide prior to shuffling. These polynucleotides 
are then cloned into the appropriate vector and the ligation mixture used to transform 
bacteria. 

It is contemplated that the single polynucleotides may be obtained from the larger 
concatemeric polynucleotide by amplification of the single polynucleotide prior to 
cloning by a variety of methods including PCR (USPN 4,683,195 and USPN 4,683,202), 
rather than by digestion of the concatemer. 

The vector used for cloning is not critical provided that it will accept a 
polynucleotide of the desired size. If expression of the particular polynucleotide is 
desired, the cloning vehicle should further comprise transcription and translation signals 
next to the site of insertion of the polynucleotide to allow expression of the 
polynucleotide in the host cell. Preferred vectors include the pUC series and the pBR 
series of plasmids. 

The resulting bacterial population will include a number of recombinant 
polynucleotides having random mutations. This mixed population may be tested to 
identify the desired recombinant polynucleotides. The method of selection will depend 
on the polynucleotide desired. 

For example, if a polynucleotide which encodes a protein with increased binding 
efficiency to a ligand is desired, the proteins expressed by each of the portions of the 
polynucleotides in the population or library may be tested for their ability to bind to the 
ligand by methods known in the art (i.e. panning, affinity chromatography). If a 
polynucleotide which encodes for a protein with increased drug resistance is desired, the 
proteins expressed by each of the polynucleotides in the population or library may be 
tested for their ability to confer drug resistance to the host organism. One skilled in the 
art, given knowledge of the desired protein, could readily test the population to identify 
polynucleotides which confer the desired properties onto the protein. 

It is contemplated that one skilled in the art could use a phage display system in 
which fragments of the protein are expressed as fusion proteins on the phage surface 
(Pharmacia, Milwaukee WI). The recombinant DNA molecules are cloned into the phage 
DNA at a site which results in the transcription of a fusion protein, a portion of which is 
encoded by the recombinant DNA molecule. The phage containing the recombinant 
nucleic acid molecule undergoes replication and transcription in the cell. The leader 
sequence of the fusion protein directs the transport of the fusion protein to the tip of the 
phage particle. Thus the fusion protein which is partially encoded by the recombinant 
DNA molecule is displayed on the phage particle for detection and selection by the 
methods described above. 

It is further contemplated that a number of cycles of nucleic acid shuffling may be 
conducted with polynucleotides from a sub-population of the first population, which 
sub-population contains DNA encoding the desired recombinant protein. In this manner, 
proteins with even higher binding affinities or enzymatic activity could be achieved. 

It is also contemplated that a number of cycles of nucleic acid shuffling may be 
conducted with a mixture of wild-type polynucleotides and a sub-population of nucleic 
acid from the first or subsequent rounds of nucleic acid shuffling in order to remove any 
silent mutations from the sub-population. 

Any source of nucleic acid, in purified form, can be utilized as the starting nucleic 
acid. Thus the process may employ DNA or RNA including messenger RNA, which 
DNA or RNA may be single or double stranded. In addition, a DNA-RNA hybrid which 
contains one strand of each may be utilized. The nucleic acid sequence may be of 
various lengths depending on the size of the nucleic acid sequence to be mutated. 
Preferably the specific nucleic acid sequence is from 50 to 50000 base pairs. It is 
contemplated that entire vectors containing the nucleic acid encoding the protein of 
interest may be used in the methods of this invention. 

The nucleic acid may be obtained from any source, for example, from plasmids 
such as pBR322, from cloned DNA or RNA or from natural DNA or RNA from any 
source including bacteria, yeast, viruses and higher organisms such as plants or animals. 
DNA or RNA may be extracted from blood or tissue material. The template 
polynucleotide may be obtained by amplification using the polymerase chain reaction 
(PCR, see USPN 4,683,202 and USPN 4,683,195). Alternatively, the polynucleotide 
may be present in a vector present in a cell and sufficient nucleic acid may be obtained by 
culturing the cell and extracting the nucleic acid from the cell by methods known in the 
art. 

Any specific nucleic acid sequence can be used to produce the population of 
hybrids by the present process. It is only necessary that a small population of hybrid 
sequences of the specific nucleic acid sequence exist or be created prior to the present 
process. 



The initial small population of the specific nucleic acid sequences having 
mutations may be created by a number of different methods. Mutations may be created 
by error-prone PCR. Error-prone PCR uses low-fidelity polymerization conditions to 
introduce a low level of point mutations randomly over a long sequence. Alternatively, 
mutations can be introduced into the template polynucleotide by oligonucleotide-directed 
mutagenesis. In oligonucleotide-directed mutagenesis, a short sequence of the 
polynucleotide is removed from the polynucleotide using restriction enzyme digestion 
and is replaced with a synthetic polynucleotide in which various bases have been altered 
from the original sequence. The polynucleotide sequence can also be altered by chemical 
mutagenesis. Chemical mutagens include, for example, sodium bisulfite, nitrous acid, 
hydroxylamine, hydrazine or formic acid. Other agents which are analogues of nucleotide 
precursors include nitrosoguanidine, 5-bromouracil, 2-aminopurine, or acridine. 
Generally, these agents are added to the PCR reaction in place of the nucleotide precursor 
thereby mutating the sequence. Intercalating agents such as proflavine, acriflavine, 
quinacrine and the like can also be used. Random mutagenesis of the polynucleotide 
sequence can also be achieved by irradiation with X-rays or ultraviolet light. Generally, 
plasmid polynucleotides so mutagenized are introduced into E. coli and propagated as a 
pool or library of hybrid plasmids. 

Alternatively, the small mixed population of specific nucleic acids may be found 
in nature in that they may consist of different alleles of the same gene or the same gene 
from different related species (i.e., cognate genes). Alternatively, they may be related 
DNA sequences found within one species, for example, the immunoglobulin genes. 



Once the mixed population of the specific nucleic acid sequences is generated, the 
polynucleotides can be used directly or inserted into an appropriate cloning vector, using 
techniques well-known in the art. 

The choice of vector depends on the size of the polynucleotide sequence and the 
host cell to be employed in the methods of this invention. The templates of this invention 
may be plasmids, phages, cosmids, phagemids, viruses (e.g., retroviruses, 
parainfluenzavirus, herpesviruses, reoviruses, paramyxoviruses, and the like), or selected 
portions thereof (e.g., coat protein, spike glycoprotein, capsid protein). For example, 
cosmids and phagemids are preferred where the specific nucleic acid sequence to be 
mutated is larger because these vectors are able to stably propagate large polynucleotides. 
If the mixed population of the specific nucleic acid sequence is cloned into a 
vector it can be clonally amplified by inserting each vector into a host cell and allowing 
the host cell to amplify the vector. This is referred to as clonal amplification because 
while the absolute number of nucleic acid sequences increases, the number of hybrids 
does not increase. Utility can be readily determined by screening expressed polypeptides. 

The DNA shuffling method of this invention can be performed blindly on a pool 
of unknown sequences. By adding to the reassembly mixture oligonucleotides (with ends 
that are homologous to the sequences being reassembled) any sequence mixture can be 
incorporated at any specific position into another sequence mixture. Thus, it is 
contemplated that mixtures of synthetic oligonucleotides, PCR polynucleotides or even 
whole genes can be mixed into another sequence library at defined positions. The 
insertion of one sequence (mixture) is independent from the insertion of a sequence in 
another part of the template. Thus, the degree of recombination, the homology required, 
and the diversity of the library can be independently and simultaneously varied along the 
length of the reassembled DNA. 



This approach of mixing two genes may be useful for the humanization of 
antibodies from murine hybridomas. The approach of mixing two genes or inserting 
alternative sequences into genes may be useful for any therapeutically used protein, for 
example, interleukin I, antibodies, tPA and growth hormone. The approach may also be 
useful in any nucleic acid, for example, promoters or introns or 3' untranslated regions or 
5' untranslated regions of genes, to increase expression or alter specificity of expression 
of proteins. The approach may also be used to mutate ribozymes or aptamers. 

Shuffling requires the presence of homologous regions separating regions of 
diversity. Scaffold-like protein structures may be particularly suitable for shuffling. The 
conserved scaffold determines the overall folding by self-association, while displaying 
relatively unrestricted loops that mediate the specific binding. Examples of such 
scaffolds are the immunoglobulin beta-barrel and the four-helix bundle, which are 
well-known in the art. This shuffling can be used to create scaffold-like proteins with 
various combinations of mutated sequences for binding. 

In vitro Shuffling 

The equivalents of some standard genetic matings may also be performed by 
shuffling in vitro. For example, a "molecular backcross" can be performed by repeatedly 
mixing the hybrid's nucleic acid with the wild-type nucleic acid while selecting for the 
mutations of interest. As in traditional breeding, this approach can be used to combine 
phenotypes from different sources into a background of choice. It is useful, for example, 
for the removal of neutral mutations that affect unselected characteristics (i.e. 
immunogenicity). Thus it can be useful to determine which mutations in a protein are 
involved in the enhanced biological activity and which are not, an advantage which 
cannot be achieved by error-prone mutagenesis or cassette mutagenesis methods. 

Large, functional genes can be assembled correctly from a mixture of small 
random polynucleotides. This reaction may be of use for the reassembly of genes from 
the highly fragmented DNA of fossils. In addition, random nucleic acid fragments from 
fossils may be combined with polynucleotides from similar genes from related species. 

It is also contemplated that the method of this invention can be used for the in 
vitro amplification of a whole genome from a single cell as is needed for a variety of 
research and diagnostic applications. DNA amplification by PCR is in practice limited to 
a length of about 40 kb. Amplification of a whole genome such as that of E. coli (5,000 
kb) by PCR would require about 250 primers yielding 125 forty kb polynucleotides. This 
approach is not practical due to the unavailability of sufficient sequence data. On the 
other hand, random production of polynucleotides of the genome with sexual PCR cycles, 
followed by gel purification of small polynucleotides will provide a multitude of possible 
primers. Use of this mix of random small polynucleotides as primers in a PCR reaction 
alone or with the whole genome as the template should result in an inverse chain reaction 
with the theoretical endpoint of a single concatamer containing many copies of the 
genome. 
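
The primer count quoted above follows from simple arithmetic: the genome is tiled with amplicons of the maximum practical length, and each amplicon needs two primers. The sketch below reproduces that estimate; the function name `pcr_tiling` is hypothetical, and the model assumes non-overlapping amplicons.

```python
import math


def pcr_tiling(genome_kb, max_amplicon_kb):
    """Return (number of amplicons, number of primers) for end-to-end tiling."""
    amplicons = math.ceil(genome_kb / max_amplicon_kb)
    primers = 2 * amplicons  # one forward and one reverse primer per amplicon
    return amplicons, primers


amplicons, primers = pcr_tiling(genome_kb=5000, max_amplicon_kb=40)
print(amplicons, primers)
```

Every one of those primers would have to be designed from known sequence, which is why the conventional approach is described as impractical when sequence data are unavailable.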

100 fold amplification in the copy number and an average polynucleotide size of 
greater than 50 kb may be obtained when only random polynucleotides are used. It is 
thought that the larger concatamer is generated by overlap of many smaller 
polynucleotides. The quality of specific PCR products obtained using synthetic primers 
will be indistinguishable from the product obtained from unamplified DNA. It is 
expected that this approach will be useful for the mapping of genomes. 

The polynucleotide to be shuffled can be produced as random or non-random 
polynucleotides, at the discretion of the practitioner. Moreover, this invention provides a 
method of shuffling that is applicable to a wide range of polynucleotide sizes and types, 
including the step of generating polynucleotide monomers to be used as building blocks 
in the reassembly of a larger polynucleotide. For example, the building blocks can be 
fragments of genes or they can be comprised of entire genes or gene pathways, or any 
combination thereof. 


In vivo Shuffling 

In an embodiment of in vivo shuffling, the mixed population of the specific 
nucleic acid sequence is introduced into bacterial or eukaryotic cells under conditions 
such that at least two different nucleic acid sequences are present in each host cell. The 
polynucleotides can be introduced into the host cells by a variety of different methods. 
The host cells can be transformed with the smaller polynucleotides using methods known 
in the art, for example treatment with calcium chloride. If the polynucleotides are 
inserted into a phage genome, the host cell can be transfected with the recombinant phage 
genome having the specific nucleic acid sequences. Alternatively, the nucleic acid 
sequences can be introduced into the host cell using electroporation, transfection, 
lipofection, biolistics, conjugation, and the like. 

In general, in this embodiment, the specific nucleic acid sequences will be 
present in vectors which are capable of stably replicating the sequence in the host cell. In 
addition, it is contemplated that the vectors will encode a marker gene such that host cells 
having the vector can be selected. This ensures that the mutated specific nucleic acid 
sequence can be recovered after introduction into the host cell. However, it is 
contemplated that the entire mixed population of the specific nucleic acid sequences need 
not be present on a vector sequence. Rather only a sufficient number of sequences need 
be cloned into vectors to ensure that after introduction of the polynucleotides into the host 
cells each host cell contains one vector having at least one specific nucleic acid sequence 
present therein. It is also contemplated that rather than having a subset of the population 
of the specific nucleic acid sequences cloned into vectors, this subset may be already 
stably integrated into the host cell. 

It has been found that when two polynucleotides which have regions of identity 
are inserted into the host cells, homologous recombination occurs between the two 
polynucleotides. Such recombination between the two mutated specific nucleic acid 
sequences will result in the production of double or triple hybrids in some situations. 

It has also been found that the frequency of recombination is increased if some of 
the mutated specific nucleic acid sequences are present on linear nucleic acid molecules. 
25 Therefore, in a preferred embodiment, some of the specific nucleic acid sequences are 
present on linear polynucleotides. 

After transformation, the host cell transformants are placed under selection to
identify those host cell transformants which contain mutated specific nucleic acid
sequences having the qualities desired. For example, if increased resistance to a
particular drug is desired, then the transformed host cells may be subjected to increased
concentrations of the particular drug, and those transformants producing mutated proteins
able to confer increased drug resistance will be selected. If the enhanced ability of a
particular protein to bind to a receptor is desired, then expression of the protein can be
induced from the transformants and the resulting protein assayed in a ligand binding
assay by methods known in the art to identify that subset of the mutated population which
shows enhanced binding to the ligand. Alternatively, the protein can be expressed in
another system to ensure proper processing.

Once a subset of the first recombined specific nucleic acid sequences (daughter
sequences) having the desired characteristics is identified, they are then subjected to a
second round of recombination.

In the second cycle of recombination, the recombined specific nucleic acid
sequences may be mixed with the original mutated specific nucleic acid sequences
(parent sequences) and the cycle repeated as described above. In this way a set of second
recombined specific nucleic acid sequences can be identified which have enhanced
characteristics or encode proteins having enhanced properties. This cycle can be
repeated a number of times as desired.

It is also contemplated that in the second or subsequent recombination cycle, a
backcross can be performed. A molecular backcross can be performed by mixing the
desired specific nucleic acid sequences with a large number of the wild-type sequence,
such that at least one wild-type nucleic acid sequence and a mutated nucleic acid
sequence are present in the same host cell after transformation. Recombination with the
wild-type specific nucleic acid sequence will eliminate those neutral mutations that may
affect unselected characteristics such as immunogenicity but not the selected
characteristics.

In another embodiment of this invention, it is contemplated that during the first
round a subset of the specific nucleic acid sequences can be generated as smaller
polynucleotides by slowing or halting their PCR amplification prior to introduction into
the host cell. The size of the polynucleotides must be large enough to contain some
regions of identity with the other sequences so as to homologously recombine with the
other sequences. The size of the polynucleotides will range from 0.03 kb to 100 kb, more
preferably from 0.2 kb to 10 kb. It is also contemplated that in subsequent rounds, all of
the specific nucleic acid sequences other than the sequences selected from the previous
round may be utilized to generate PCR polynucleotides prior to introduction into the host
cells.



The shorter polynucleotide sequences can be single-stranded or double-stranded.
If the sequences were originally single-stranded and have become double-stranded, they
can be denatured with heat, chemicals or enzymes prior to insertion into the host cell.
The reaction conditions suitable for separating the strands of nucleic acid are well known
in the art.

The steps of this process can be repeated indefinitely, being limited only by the
number of possible hybrids which can be achieved. After a certain number of cycles, all
possible hybrids will have been achieved and further cycles are redundant.

In an embodiment the same mutated template nucleic acid is repeatedly 
recombined and the resulting recombinants selected for the desired characteristic. 

Therefore, the initial pool or population of mutated template nucleic acid is
cloned into a vector capable of replicating in a bacterium such as E. coli. The particular
vector is not essential, so long as it is capable of autonomous replication in E. coli. In a
preferred embodiment, the vector is designed to allow the expression and production of
any protein encoded by the mutated specific nucleic acid linked to the vector. It is also
preferred that the vector contain a gene encoding a selectable marker.

The population of vectors containing the pool of mutated nucleic acid sequences
is introduced into the E. coli host cells. The vector nucleic acid sequences may be
introduced by transformation, transfection or infection in the case of phage. The
concentration of vectors used to transform the bacteria is such that a number of vectors is
introduced into each cell. Once present in the cell, the efficiency of homologous
recombination is such that homologous recombination occurs between the various
vectors. This results in the generation of hybrids (daughters) having a combination of
mutations which differ from the original parent mutated sequences.

The host cells are then clonally replicated and selected for the marker gene
present on the vector. Only those cells having a plasmid will grow under the selection.

The host cells which contain a vector are then tested for the presence of favorable
mutations. Such testing may consist of placing the cells under selective pressure, for
example, if the gene to be selected is an improved drug resistance gene. If the vector
allows expression of the protein encoded by the mutated nucleic acid sequence, then such
selection may include allowing expression of the protein so encoded, isolation of the
protein and testing of the protein to determine whether, for example, it binds with
increased efficiency to the ligand of interest.

Once a particular daughter mutated nucleic acid sequence has been identified
which confers the desired characteristics, the nucleic acid is isolated either already linked
to the vector or separated from the vector. This nucleic acid is then mixed with the first
or parent population of nucleic acids and the cycle is repeated.

It has been shown that by this method nucleic acid sequences having enhanced
desired properties can be selected.

In an alternate embodiment, the first generation of hybrids are retained in the cells
and the parental mutated sequences are added again to the cells. Accordingly, the first
cycle of Embodiment I is conducted as described above. However, after the daughter
nucleic acid sequences are identified, the host cells containing these sequences are
retained.

The parent mutated specific nucleic acid population, either as polynucleotides or
cloned into the same vector, is introduced into the host cells already containing the
daughter nucleic acids. Recombination is allowed to occur in the cells and the next
generation of recombinants, or granddaughters, are selected by the methods described
above.

This cycle can be repeated a number of times until the nucleic acid or peptide
having the desired characteristics is obtained. It is contemplated that in subsequent
cycles, the population of mutated sequences which are added to the preferred hybrids
may come from the parental hybrids or any subsequent generation.

In an alternative embodiment, the invention provides a method of conducting a
"molecular" backcross of the obtained recombinant specific nucleic acid in order to
eliminate any neutral mutations. Neutral mutations are those mutations which do not
confer onto the nucleic acid or peptide the desired properties. Such mutations may
however confer on the nucleic acid or peptide undesirable characteristics. Accordingly, it
is desirable to eliminate such neutral mutations. The method of this invention provides a
means of doing so.

In this embodiment, after the hybrid nucleic acid, having the desired
characteristics, is obtained by the methods of the embodiments, the nucleic acid, the
vector having the nucleic acid or the host cell containing the vector and nucleic acid is
isolated.

The nucleic acid or vector is then introduced into the host cell with a large excess
of the wild-type nucleic acid. The nucleic acid of the hybrid and the nucleic acid of the
wild-type sequence are allowed to recombine. The resulting recombinants are placed
under the same selection as the hybrid nucleic acid. Only those recombinants which
retained the desired characteristics will be selected. Any silent mutations which do not
provide the desired characteristics will be lost through recombination with the wild-type
DNA. This cycle can be repeated a number of times until all of the silent mutations are
eliminated.

Thus the methods of this invention can be used in a molecular backcross to 
eliminate unnecessary or silent mutations. 
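
The backcross cycle described above can be illustrated with a small simulation. The following Python sketch is only a toy model under stated assumptions: sequences are equal-length strings, recombination is modeled as a single crossover with the wild-type sequence, and selection is a caller-supplied predicate. The function names (`mutation_count`, `single_crossovers`, `backcross`) and the example sequences are hypothetical and are not taken from the disclosure.

```python
def mutation_count(seq, wild_type):
    """Number of positions at which seq differs from the wild-type sequence."""
    return sum(1 for a, b in zip(seq, wild_type) if a != b)


def single_crossovers(a, b):
    """All single-crossover recombinants of two equal-length sequences."""
    products = set()
    for i in range(len(a) + 1):
        products.add(a[:i] + b[i:])
        products.add(b[:i] + a[i:])
    return products


def backcross(hybrid, wild_type, has_trait, max_cycles=10):
    """Repeatedly recombine with wild type, keeping selected recombinants.

    In each cycle the recombinants that still pass selection are kept, and
    the one carrying the fewest mutations is used for the next cycle.
    """
    current = hybrid
    for _ in range(max_cycles):
        survivors = [s for s in single_crossovers(current, wild_type) if has_trait(s)]
        if not survivors:
            break  # nothing passed selection; keep what we have
        best = min(survivors, key=lambda s: (mutation_count(s, wild_type), s))
        if best == current:
            break  # no further mutations can be removed
        current = best
    return current


# Toy example: the mutation at index 4 confers the trait; the mutations at
# indices 1 and 8 are neutral passengers (an assumption for illustration).
wild_type = "AAAAAAAAAA"
hybrid = "ATAAGAAACA"
cleaned = backcross(hybrid, wild_type, lambda s: s[4] == "G")
print(cleaned)  # AAAAGAAAAA
```

In this toy case two cycles are enough to remove both passenger mutations while the selected mutation is retained, mirroring the statement that the cycle can be repeated until the silent mutations are eliminated.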


2.11.2. EXONUCLEASE-MEDIATED REASSEMBLY

In a particular embodiment, this invention provides for a method for shuffling,
assembling, reassembling, recombining, &/or concatenating at least two polynucleotides
to form a progeny polynucleotide (e.g. a chimeric progeny polynucleotide that can be
expressed to produce a polypeptide or a gene pathway). In a particular embodiment, a
double stranded polynucleotide end (e.g. two single stranded sequences hybridized to
each other as hybridization partners) is treated with an exonuclease to liberate nucleotides
from one of the two strands, leaving the remaining strand free of its original partner so
that, if desired, the remaining strand may be used to achieve hybridization to another
partner.

In a particular aspect, a double stranded polynucleotide end (that may be part of -
or connected to - a polynucleotide or a nonpolynucleotide sequence) is subjected to a
source of exonuclease activity. Serviceable sources of exonuclease activity may be an
enzyme with 3' exonuclease activity, an enzyme with 5' exonuclease activity, an enzyme
with both 3' exonuclease activity and 5' exonuclease activity, and any combination
thereof. An exonuclease can be used to liberate nucleotides from one or both ends of a
linear double stranded polynucleotide, and from one to all ends of a branched
polynucleotide having more than two ends. The mechanism of action of this liberation is
believed to be comprised of an enzymatically-catalyzed hydrolysis of terminal
nucleotides, and can be allowed to proceed in a time-dependent fashion, allowing
experimental control of the progression of the enzymatic process.

By contrast, a non-enzymatic step may be used to shuffle, assemble, reassemble,
recombine, and/or concatenate polynucleotide building blocks, which step is comprised of
subjecting a working sample to denaturing (or "melting") conditions (for example, by
changing temperature, pH, and/or salinity conditions) so as to melt a working set of
double stranded polynucleotides into single polynucleotide strands. For shuffling, it is
desirable that the single polynucleotide strands participate to some extent in annealment
with different hybridization partners (i.e. and not merely revert to exclusive reannealment
between what were former partners before the denaturation step). The presence of the
former hybridization partners in the reaction vessel, however, does not preclude, and may
sometimes even favor, reannealment of a single stranded polynucleotide with its former
partner, to recreate an original double stranded polynucleotide.

In contrast to this non-enzymatic shuffling step comprised of subjecting double
stranded polynucleotide building blocks to denaturation, followed by annealment, the
instant invention further provides an exonuclease-based approach requiring no
denaturation - rather, the avoidance of denaturing conditions and the maintenance of
double stranded polynucleotide substrates in annealed (i.e. non-denatured) state are
necessary conditions for the action of exonucleases (e.g., exonuclease III and red alpha
gene product). Additionally in contrast, the generation of single stranded polynucleotide
sequences capable of hybridizing to other single stranded polynucleotide sequences is the
result of covalent cleavage - and hence sequence destruction - in one of the hybridization
partners. For example, an exonuclease III enzyme may be used to enzymatically liberate
3' terminal nucleotides in one hybridization strand (to achieve covalent hydrolysis in that
polynucleotide strand); and this favors hybridization of the remaining single strand to a
new partner (since its former partner was subjected to covalent cleavage).

By way of further illustration, a specific exonuclease, namely exonuclease III, is
provided herein as an example of a 3' exonuclease; however, other exonucleases may
also be used, including enzymes with 5' exonuclease activity and enzymes with 3'
exonuclease activity, and including enzymes not yet discovered and enzymes not yet
developed. It is particularly appreciated that enzymes can be discovered, optimized (e.g.
engineered by directed evolution), or both discovered and optimized specifically for the
instantly disclosed approach that have more optimal rates &/or more highly specific
activities &/or greater lack of unwanted activities. In fact it is expected that the instant
invention may encourage the discovery &/or development of such designer enzymes. In
sum, this invention may be practiced with a variety of currently available exonuclease
enzymes, as well as enzymes not yet discovered and enzymes not yet developed.

The exonuclease action of exonuclease III requires a working double stranded
polynucleotide end that is either blunt or has a 5' overhang, and the exonuclease action is
comprised of enzymatically liberating 3' terminal nucleotides, leaving a single stranded
5' end that becomes longer and longer as the exonuclease action proceeds (see Figure 1).
Any 5' overhangs produced by this approach may be used to hybridize to another single
stranded polynucleotide sequence (which may also be a single stranded polynucleotide or
a terminal overhang of a partially double stranded polynucleotide) that shares enough
homology to allow hybridization. The ability of these exonuclease III-generated single
stranded sequences (e.g. in 5' overhangs) to hybridize to other single stranded sequences
allows two or more polynucleotides to be shuffled, assembled, reassembled, &/or
concatenated.

Furthermore, it is appreciated that one can protect the end of a double stranded
polynucleotide or render it susceptible to a desired enzymatic action of a serviceable
exonuclease as necessary. For example, a double stranded polynucleotide end having a
3' overhang is not susceptible to the exonuclease action of exonuclease III. However, it
may be rendered susceptible to the exonuclease action of exonuclease III by a variety of
means; for example, it may be blunted by treatment with a polymerase, cleaved to
provide a blunt end or a 5' overhang, joined (ligated or hybridized) to another double
stranded polynucleotide to provide a blunt end or a 5' overhang, hybridized to a single
stranded polynucleotide to provide a blunt end or a 5' overhang, or modified by any of a
variety of means.

According to one aspect, an exonuclease may be allowed to act on one or on both
ends of a linear double stranded polynucleotide and proceed to completion, to near
completion, or to partial completion. When the exonuclease action is allowed to go to
completion, the result will be that each 5' overhang will extend far
towards the middle region of the polynucleotide in the direction of what might be
considered a "rendezvous point" (which may be somewhere near the polynucleotide
midpoint). Ultimately, this results in the production of single stranded polynucleotides
(that can become dissociated) that are each about half the length of the original double
stranded polynucleotide (see Figure 1). Alternatively, an exonuclease-mediated reaction
can be terminated before proceeding to completion.
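
The progressive 3' digestion of a blunt duplex described above can be sketched in a few lines of Python. This is an idealized model and not a description of actual enzyme kinetics: it assumes that the same number of nucleotides is removed from each 3' end, that digestion stops once no duplex region remains, and that sequences contain only A, C, G and T. The function name `exo3_digest` and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def exo3_digest(top, removed):
    """Idealized 3'->5' digestion of a blunt duplex from both ends.

    top     -- top strand, written 5' to 3'
    removed -- nucleotides liberated from the 3' end of each strand

    Returns (top_remaining, bottom_remaining, duplex_length), with both
    strands written 5' to 3'. Digestion is capped at half the length,
    where the duplex region (the enzyme's substrate) disappears.
    """
    length = len(top)
    removed = max(0, min(removed, length // 2))
    top_remaining = top[:length - removed]                 # 3' end of top strand is on the right
    bottom_remaining = reverse_complement(top[removed:])   # 3' end of bottom strand is on the left
    duplex_length = length - 2 * removed
    return top_remaining, bottom_remaining, duplex_length


# Partial digestion: 3-nucleotide 5' overhangs flank a 6-bp duplex core.
print(exo3_digest("ATGCCGTAAGCT", 3))    # ('ATGCCGTAA', 'AGCTTACGG', 6)

# Digestion to completion: two single strands, each half the original length.
print(exo3_digest("ATGCCGTAAGCT", 100))  # ('ATGCCG', 'AGCTTA', 0)
```

Stopping at an intermediate value of `removed` corresponds to terminating the reaction before completion, which leaves 5' overhangs of controlled length available for hybridization to new partners.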

Thus this exonuclease-mediated approach is serviceable for shuffling, assembling
&/or reassembling, recombining, and concatenating polynucleotide building blocks,
which polynucleotide building blocks can be up to ten bases long or tens of bases long or
hundreds of bases long or thousands of bases long or tens of thousands of bases long or
hundreds of thousands of bases long or millions of bases long or even longer.

This exonuclease-mediated approach is based on the action of double stranded
DNA specific exodeoxyribonuclease activity of E. coli exonuclease III. Substrates for
exonuclease III may be generated by subjecting a double stranded polynucleotide to
fragmentation. Fragmentation may be achieved by mechanical means (e.g., shearing,
sonication, etc.), by enzymatic means (e.g. using restriction enzymes), and by any
combination thereof. Fragments of a larger polynucleotide may also be generated by
polymerase-mediated synthesis.

Exonuclease III is a 28K monomeric enzyme, product of the xthA gene of E. coli,
with four known activities: exodeoxyribonuclease (alternatively referred to as
exonuclease herein), RNaseH, DNA-3'-phosphatase, and AP endonuclease. The
exodeoxyribonuclease activity is specific for double stranded DNA. The mechanism of
action is thought to involve enzymatic hydrolysis of DNA from a 3' end progressively
towards a 5' direction, with formation of nucleoside 5'-phosphates and a residual single
strand. The enzyme does not display efficient hydrolysis of single stranded DNA, single-
stranded RNA, or double-stranded RNA; however it degrades RNA in a DNA-RNA
hybrid, releasing nucleoside 5'-phosphates. The enzyme also releases inorganic
phosphate specifically from 3'-phosphomonoester groups on DNA, but not from RNA or
short oligonucleotides. Removal of these groups converts the terminus into a primer for
DNA polymerase action.

Additional examples of enzymes with exonuclease activity include red alpha and
venom phosphodiesterases. Red alpha (redα) gene product (also referred to as lambda
exonuclease) is of bacteriophage λ origin. The redα gene is transcribed from the leftward
promoter and its product (24 kD) is involved in recombination. Red alpha gene product
acts processively from 5'-phosphorylated termini to liberate mononucleotides from
duplex DNA (Takahashi & Kobayashi, 1990). Venom phosphodiesterases (Laskowski,
1980) are capable of rapidly opening supercoiled DNA.



2.11.2.3. NON-STOCHASTIC LIGATION REASSEMBLY

In one aspect, the present invention provides a non-stochastic method termed
synthetic ligation reassembly (SLR), that is somewhat related to stochastic shuffling, save
that the nucleic acid building blocks are not shuffled or concatenated or chimerized
randomly, but rather are assembled non-stochastically.

A particularly glaring difference is that the instant SLR method does not depend
on the presence of a high level of homology between polynucleotides to be shuffled. In
contrast, prior methods, particularly prior stochastic shuffling methods, require the
presence of a high level of homology, particularly at coupling sites, between
polynucleotides to be shuffled. Accordingly these prior methods favor the regeneration
of the original progenitor molecules, and are suboptimal for generating large numbers of
novel progeny chimeras, particularly full-length progenies. The instant invention, on the
other hand, can be used to non-stochastically generate libraries (or sets) of progeny
molecules comprised of over 10^100 different chimeras. Conceivably, SLR can even be
used to generate libraries comprised of over 10^1000 different progeny chimeras (with no
upper limit in sight).

Thus, in one aspect, the present invention provides a method, which method is
non-stochastic, of producing a set of finalized chimeric nucleic acid molecules having an
overall assembly order that is chosen by design, which method is comprised of the steps
of generating by design a plurality of specific nucleic acid building blocks having
serviceable mutually compatible ligatable ends, and assembling these nucleic acid
building blocks, such that a designed overall assembly order is achieved.

The mutually compatible ligatable ends of the nucleic acid building blocks to be
assembled are considered to be "serviceable" for this type of ordered assembly if they
enable the building blocks to be coupled in predetermined orders. Thus, in one aspect,
the overall assembly order in which the nucleic acid building blocks can be coupled is
specified by the design of the ligatable ends and, if more than one assembly step is to be
used, then the overall assembly order in which the nucleic acid building blocks can be
coupled is also specified by the sequential order of the assembly step(s). Figure 4, Panel
C illustrates an exemplary assembly process comprised of 2 sequential steps to achieve a
designed (non-stochastic) overall assembly order for five nucleic acid building blocks. In
a preferred embodiment of this invention, the annealed building pieces are treated with an
enzyme, such as a ligase (e.g. T4 DNA ligase), to achieve covalent bonding of the building
pieces.
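
The idea that the design of the ligatable ends specifies the assembly order can be sketched as follows. This Python example is a deliberate simplification: each building block carries abstract labels for its left and right ends rather than real overhang sequences, two ends are treated as compatible when their labels match, and each label is used once. The function name `assemble`, the `"START"` label and the example blocks are hypothetical.

```python
def assemble(blocks, start_end="START"):
    """Chain building blocks by matching end labels.

    blocks -- iterable of (left_end, sequence, right_end) tuples, in any order.
    The assembled order is fixed entirely by the end labels, not by the
    order in which the blocks are supplied.
    """
    by_left = {}
    for left, seq, right in blocks:
        if left in by_left:
            raise ValueError(f"ambiguous ligatable end: {left!r}")
        by_left[left] = (seq, right)

    ordered = []
    end = start_end
    while end in by_left:
        seq, end = by_left.pop(end)  # pop so that no block is used twice
        ordered.append(seq)

    if by_left:
        raise ValueError("some building blocks could not be coupled")
    return "".join(ordered)


# Blocks supplied in scrambled order still assemble in the designed order.
blocks = [
    ("b", "GGG", "c"),
    ("START", "AAA", "a"),
    ("c", "TTT", "END"),
    ("a", "CCC", "b"),
]
print(assemble(blocks))  # AAACCCGGGTTT
```

Rejecting duplicate left-end labels reflects the requirement that the ends be designed so that only the intended couplings are possible; a multi-step assembly could be modeled by calling `assemble` on subsets of blocks and then on the resulting intermediates.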

In a preferred embodiment, the design of nucleic acid building blocks is obtained
upon analysis of the sequences of a set of progenitor nucleic acid templates that serve as a
basis for producing a progeny set of finalized chimeric nucleic acid molecules. These
progenitor nucleic acid templates thus serve as a source of sequence information that aids
in the design of the nucleic acid building blocks that are to be mutagenized, i.e.
chimerized or shuffled.

In one exemplification, this invention provides for the chimerization of a family
of related genes and their encoded family of related products. In a particular
exemplification, the encoded products are enzymes. As a representative list of families
of enzymes which may be mutagenized in accordance with the aspects of the present
invention, there may be mentioned the following enzymes and their functions:

1 Lipase/Esterase
  a. Enantioselective hydrolysis of esters (lipids)/thioesters
     1) Resolution of racemic mixtures
     2) Synthesis of optically active acids or alcohols from meso-diesters
  b. Selective syntheses
     1) Regiospecific hydrolysis of carbohydrate esters
     2) Selective hydrolysis of cyclic secondary alcohols
  c. Synthesis of optically active esters, lactones, acids, alcohols
     1) Transesterification of activated/nonactivated esters
     2) Interesterification
     3) Optically active lactones from hydroxyesters
     4) Regio- and enantioselective ring opening of anhydrides
  d. Detergents
  e. Fat/Oil conversion
  f. Cheese ripening

2 Protease
  a. Ester/amide synthesis
  b. Peptide synthesis
  c. Resolution of racemic mixtures of amino acid esters
  d. Synthesis of non-natural amino acids
  e. Detergents/protein hydrolysis

3 Glycosidase/Glycosyl transferase
  a. Sugar/polymer synthesis
  b. Cleavage of glycosidic linkages to form mono-, di- and oligosaccharides
  c. Synthesis of complex oligosaccharides
  d. Glycoside synthesis using UDP-galactosyl transferase
  e. Transglycosylation of disaccharides, glycosyl fluorides, aryl galactosides
  f. Glycosyl transfer in oligosaccharide synthesis
  g. Diastereoselective cleavage of β-glucosylsulfoxides
  h. Asymmetric glycosylations
  i. Food processing
  j. Paper processing

4 Phosphatase/Kinase
  a. Synthesis/hydrolysis of phosphate esters
     1) Regio-, enantioselective phosphorylation
     2) Introduction of phosphate esters
     3) Synthesize phospholipid precursors
     4) Controlled polynucleotide synthesis
  b. Activate biological molecule
  c. Selective phosphate bond formation without protecting groups

5 Mono/Dioxygenase
  a. Direct oxyfunctionalization of unactivated organic substrates
  b. Hydroxylation of alkane, aromatics, steroids
  c. Epoxidation of alkenes
  d. Enantioselective sulphoxidation
  e. Regio- and stereoselective Baeyer-Villiger oxidations

6 Haloperoxidase
  a. Oxidative addition of halide ion to nucleophilic sites
  b. Addition of hypohalous acids to olefinic bonds
  c. Ring cleavage of cyclopropanes
  d. Activated aromatic substrates converted to ortho and para derivatives
  e. 1,3-diketones converted to 2-halo-derivatives
  f. Heteroatom oxidation of sulfur and nitrogen containing substrates
  g. Oxidation of enol acetates, alkynes and activated aromatic rings

7 Lignin peroxidase/Diarylpropane peroxidase
  a. Oxidative cleavage of C-C bonds
  b. Oxidation of benzylic alcohols to aldehydes
  c. Hydroxylation of benzylic carbons
  d. Phenol dimerization
  e. Hydroxylation of double bonds to form diols
  f. Cleavage of lignin aldehydes

8 Epoxide hydrolase
  a. Synthesis of enantiomerically pure bioactive compounds
  b. Regio- and enantioselective hydrolysis of epoxide
  c. Aromatic and olefinic epoxidation by monooxygenases to form epoxides
  d. Resolution of racemic epoxides
  e. Hydrolysis of steroid epoxides

9 Nitrile hydratase/nitrilase
  a. Hydrolysis of aliphatic nitriles to carboxamides
  b. Hydrolysis of aromatic, heterocyclic, unsaturated aliphatic nitriles to
     corresponding acids
  c. Hydrolysis of acrylonitrile
  d. Production of aromatic and carboxamides, carboxylic acids (nicotinamide,
     picolinamide, isonicotinamide)
  e. Regioselective hydrolysis of acrylic dinitrile
  f. α-amino acids from α-hydroxynitriles

10 Transaminase
  a. Transfer of amino groups into oxo-acids

11 Amidase/Acylase
  a. Hydrolysis of amides, amidines, and other C-N bonds
  b. Non-natural amino acid resolution and synthesis

These exemplifications, while illustrating certain specific aspects of the invention,
do not portray the limitations or circumscribe the scope of the disclosed invention.

Thus according to one aspect of this invention, the sequences of a plurality of
progenitor nucleic acid templates are aligned in order to select one or more demarcation
points, which demarcation points can be located at an area of homology, and are
comprised of one or more nucleotides, and which demarcation points are shared by at
least two of the progenitor templates. The demarcation points can be used to delineate
the boundaries of nucleic acid building blocks to be generated. Thus, the demarcation
points identified and selected in the progenitor molecules serve as potential chimerization
points in the assembly of the progeny molecules.

Preferably a serviceable demarcation point is an area of homology (comprised of
at least one homologous nucleotide base) shared by at least two progenitor templates.
More preferably a serviceable demarcation point is an area of homology that is shared by
at least half of the progenitor templates. More preferably still a serviceable demarcation
point is an area of homology that is shared by at least two thirds of the progenitor
templates. Even more preferably a serviceable demarcation point is an area of
homology that is shared by at least three fourths of the progenitor templates. Even more
preferably still a serviceable demarcation point is an area of homology that is shared by
almost all of the progenitor templates. Even more preferably still a serviceable
demarcation point is an area of homology that is shared by all of the progenitor templates.
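
The selection of candidate demarcation points from an alignment can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the progenitor templates are supplied already aligned as equal-length strings with `-` marking gaps, a demarcation point is taken to be a single alignment column, and the sharing threshold is expressed as a fraction of all templates. The function name `demarcation_points`, the `min_fraction` parameter and the example alignment are hypothetical.

```python
from collections import Counter


def demarcation_points(aligned, min_fraction=1.0):
    """Find alignment columns whose most common base is widely shared.

    aligned      -- list of equal-length, pre-aligned sequences ('-' = gap)
    min_fraction -- fraction of templates that must share the base

    Returns a list of (column, base, templates_sharing) tuples. A column
    qualifies only if at least two templates share the base.
    """
    total = len(aligned)
    points = []
    for column, bases in enumerate(zip(*aligned)):
        counts = Counter(b for b in bases if b != "-")
        if not counts:
            continue  # column contains only gaps
        base, shared = counts.most_common(1)[0]
        if shared >= 2 and shared / total >= min_fraction:
            points.append((column, base, shared))
    return points


aligned = ["ATGCA", "ATGGA", "ACGCA"]

# Columns shared by all templates (the most preferred case in the text).
print([p[0] for p in demarcation_points(aligned)])                    # [0, 2, 4]

# Columns shared by at least half of the templates.
print([p[0] for p in demarcation_points(aligned, min_fraction=0.5)])  # [0, 1, 2, 3, 4]
```

Raising `min_fraction` from one half toward 1.0 walks through the increasingly preferred thresholds listed above; in practice a demarcation point may span several adjacent columns, which this single-column sketch does not model.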

The process of designing nucleic acid building blocks and of designing the
mutually compatible ligatable ends of the nucleic acid building blocks to be assembled is
illustrated in Figures 6 and 7. As shown, the alignment of a set of progenitor templates
reveals several naturally occurring demarcation points, and the identification of
demarcation points shared by these templates helps to non-stochastically determine the
building blocks to be generated and used for the generation of the progeny chimeric
molecules.

In a preferred embodiment, this invention provides that the ligation reassembly
process is performed exhaustively in order to generate an exhaustive library. In other
words, all possible ordered combinations of the nucleic acid building blocks are
represented in the set of finalized chimeric nucleic acid molecules. At the same time, in a
particularly preferred embodiment, the assembly order (i.e. the order of assembly of each
building block in the 5' to 3' sequence of each finalized chimeric nucleic acid) in each
combination is by design (or non-stochastic). Because of the non-stochastic nature of this
invention, the possibility of unwanted side products is greatly reduced.
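
An exhaustive library in the sense described above contains every ordered combination of building blocks, one block per position, so its size is the product of the number of variants available at each position. The Python sketch below enumerates such a library for a toy case. It assumes that each position in the designed assembly order has its own list of interchangeable building blocks; the function names and the example sequences are hypothetical.

```python
from itertools import product
from math import prod


def exhaustive_library(blocks_by_position):
    """All chimeras formed by picking one building block per position, in order."""
    return ["".join(combo) for combo in product(*blocks_by_position)]


def library_size(blocks_by_position):
    """Number of distinct ordered combinations, without enumerating them."""
    return prod(len(variants) for variants in blocks_by_position)


# Three positions with 2, 3 and 2 alternative building blocks respectively.
blocks_by_position = [
    ["AAA", "AAT"],
    ["CCC", "CCG", "CCT"],
    ["GGG", "GGA"],
]
library = exhaustive_library(blocks_by_position)
print(len(library), library_size(blocks_by_position))  # 12 12

# Three progenitors split at nine demarcation points give ten positions
# with three variants each.
print(library_size([["x", "y", "z"]] * 10))  # 59049
```

Because the size grows multiplicatively with the number of positions, `library_size` is the practical way to estimate a library before deciding whether it can be enumerated or must be screened in subsets.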

In another preferred embodiment, this invention provides that the ligation
reassembly process is performed systematically, for example in order to generate a
systematically compartmentalized library, with compartments that can be screened
systematically, e.g. one by one. In other words this invention provides that, through the
selective and judicious use of specific nucleic acid building blocks, coupled with the
selective and judicious use of sequentially stepped assembly reactions, an experimental
design can be achieved where specific sets of progeny products are made in each of
several reaction vessels. This allows a systematic examination and screening procedure
to be performed. Thus, it allows a potentially very large number of progeny molecules to
be examined systematically in smaller groups.
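
One simple way to picture a systematically compartmentalized library is to fix the building block used at one position in each reaction vessel and vary the others. The Python sketch below does this for the first position. It is an illustrative assumption about how compartments might be defined, not the disclosed experimental design; the function name `compartmentalized_library` and the example blocks are hypothetical.

```python
from itertools import product


def compartmentalized_library(blocks_by_position):
    """Group chimeras into compartments keyed by the first-position block.

    Each compartment stands for one reaction vessel in which the first
    building block is fixed and all combinations of the remaining
    positions are assembled.
    """
    first, rest = blocks_by_position[0], blocks_by_position[1:]
    return {
        block: [block + "".join(combo) for combo in product(*rest)]
        for block in first
    }


blocks_by_position = [
    ["AAA", "AAT"],
    ["CCC", "CCG", "CCT"],
    ["GGG", "GGA"],
]
vessels = compartmentalized_library(blocks_by_position)
for block, members in vessels.items():
    print(block, len(members))  # each vessel holds 6 chimeras
```

Screening the vessels one by one covers the same twelve chimeras as the exhaustive library, but in two groups of six, which is the sense in which a large library can be examined systematically in smaller groups.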

Because of its ability to perform chimerizations in a manner that is highly flexible
yet exhaustive and systematic as well, particularly when there is a low level of homology
among the progenitor molecules, the instant invention provides for the generation of a
library (or set) comprised of a large number of progeny molecules. Because of the non-
stochastic nature of the instant ligation reassembly invention, the progeny molecules
generated preferably comprise a library of finalized chimeric nucleic acid molecules
having an overall assembly order that is chosen by design. In a particularly preferred
embodiment of this invention, such a generated library is comprised of preferably greater
than 10^3 different progeny molecular species, more preferably greater than 10^5 different
progeny molecular species, more preferably still greater than 10^10 different progeny
molecular species, more preferably still greater than 10^15 different progeny molecular
species, more preferably still greater than 10^20 different progeny molecular species, more
preferably still greater than 10^30 different progeny molecular species, more preferably
still greater than 10^40 different progeny molecular species, more preferably still greater
than 10^50 different progeny molecular species, more preferably still greater than 10^60
different progeny molecular species, more preferably still greater than 10^70 different
progeny molecular species, more preferably still greater than 10^80 different progeny
molecular species, more preferably still greater than 10^100 different progeny molecular
species, more preferably still greater than 10^110 different progeny molecular species, more
preferably still greater than 10^120 different progeny molecular species, more preferably
still greater than 10^130 different progeny molecular species, more preferably still greater
than 10^140 different progeny molecular species, more preferably still greater than 10^150
different progeny molecular species, more preferably still greater than 10^175 different
progeny molecular species, more preferably still greater than 10^200 different progeny
molecular species, more preferably still greater than 10^300 different progeny molecular
species, more preferably still greater than 10^400 different progeny molecular species, more
preferably still greater than 10^500 different progeny molecular species, and even more
preferably still greater than 10^1000 different progeny molecular species.

In one aspect, a set of finalized chimeric nucleic acid molecules, produced as
described, is comprised of a polynucleotide encoding a polypeptide. According to one
preferred embodiment, this polynucleotide is a gene, which may be a man-made gene.
According to another preferred embodiment, this polynucleotide is a gene pathway,
which may be a man-made gene pathway. This invention provides that one or more
man-made genes generated by this invention may be incorporated into a man-made gene
pathway, such as a pathway operable in a eukaryotic organism (including a plant).

It is appreciated that the power of this invention is exceptional, as there is much
freedom of choice and control regarding the selection of demarcation points, the size and
number of the nucleic acid building blocks, and the size and design of the couplings. It is
appreciated, furthermore, that the requirement for intermolecular homology is highly
relaxed for the operability of this invention. In fact, demarcation points can even be
chosen in areas of little or no intermolecular homology. For example, because of codon
wobble, i.e. the degeneracy of codons, nucleotide substitutions can be introduced into
nucleic acid building blocks without altering the amino acid originally encoded in the
corresponding progenitor template. Alternatively, a codon can be altered such that the
coding for an original amino acid is altered. This invention provides that such
substitutions can be introduced into the nucleic acid building block in order to increase
the incidence of intermolecularly homologous demarcation points and thus to allow an
increased number of couplings to be achieved among the building blocks, which in turn
allows a greater number of progeny chimeric molecules to be generated.
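
The use of codon degeneracy described above can be illustrated with a minimal sketch, offered only as an illustration and not as part of the disclosed method. It assumes the standard genetic code; the function names (synonyms, best_silent_match) and the example codons are hypothetical. Given one codon from each of two progenitors, it searches the synonymous codons of each for the pair sharing the most identical nucleotides, without changing either encoded amino acid.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonyms(codon):
    """All codons encoding the same amino acid as `codon` (including itself)."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def best_silent_match(codon_a, codon_b):
    """Hypothetical helper: choose silent replacements for two codons so that
    they share as many nucleotide positions as possible.

    Returns (new_a, new_b, identical_positions). Each replacement is drawn
    from the synonyms of the original codon, so neither amino acid changes.
    """
    best = None
    for a in synonyms(codon_a):
        for b in synonyms(codon_b):
            shared = sum(x == y for x, y in zip(a, b))
            if best is None or shared > best[2]:
                best = (a, b, shared)
    return best
```

When both progenitors encode the same amino acid at a position (for example leucine as TTA in one and CTG in the other), the search can settle on one shared codon, which yields three identical nucleotides at a candidate demarcation point. When the amino acids differ, at most a partial match is possible without altering the coding.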

In another exemplification, the synthetic nature of the step in which the building
blocks are generated allows the design and introduction of nucleotides (e.g. one or more
nucleotides, which may be, for example, codons or introns or regulatory sequences) that
can later be optionally removed in an in vitro process (e.g. by mutagenesis) or in an in
vivo process (e.g. by utilizing the gene splicing ability of a host organism). It is
appreciated that in many instances the introduction of these nucleotides may also be
desirable for many other reasons in addition to the potential benefit of creating a
serviceable demarcation point.

Thus, according to another embodiment, this invention provides that a nucleic
acid building block can be used to introduce an intron. Thus, this invention provides that
functional introns may be introduced into a man-made gene of this invention. This
invention also provides that functional introns may be introduced into a man-made gene
pathway of this invention. Accordingly, this invention provides for the generation of a
chimeric polynucleotide that is a man-made gene containing one (or more) artificially
introduced intron(s).

Accordingly, this invention also provides for the generation of a chimeric
polynucleotide that is a man-made gene pathway containing one (or more) artificially
introduced intron(s). Preferably, the artificially introduced intron(s) are functional in one
or more host cells for gene splicing much in the way that naturally-occurring introns
serve functionally in gene splicing. This invention provides a process of producing
man-made intron-containing polynucleotides to be introduced into host organisms for
recombination and/or splicing.

The ability to achieve chimerizations, using couplings as described herein, in
areas of little or no homology among the progenitor molecules, is particularly useful, and
in fact critical, for the assembly of novel gene pathways. This invention thus provides for
the generation of novel man-made gene pathways using synthetic ligation reassembly. In
a particular aspect, this is achieved by the introduction of regulatory sequences, such as
promoters, that are operable in an intended host, to confer operability to a novel gene
pathway when it is introduced into the intended host. In a particular exemplification, this
invention provides for the generation of a novel man-made gene pathway that is operable
in a plurality of intended hosts (e.g. in a microbial organism as well as in a plant cell).
This can be achieved, for example, by the introduction of a plurality of regulatory
sequences, comprised of a regulatory sequence that is operable in a first intended host
and a regulatory sequence that is operable in a second intended host. A similar process
can be performed to achieve operability of a gene pathway in a third intended host
species, etc. The number of intended host species can be each integer from 1 to 10 or
alternatively over 10. Alternatively, for example, operability of a gene pathway in a
plurality of intended hosts can be achieved by the introduction of a regulatory sequence
having intrinsic operability in a plurality of intended hosts.

Thus, according to a particular embodiment, this invention provides that a nucleic
acid building block can be used to introduce a regulatory sequence, particularly a
regulatory sequence for gene expression. Preferred regulatory sequences include, but are
not limited to, those that are man-made, and those found in archaeal, bacterial, eukaryotic
(including mitochondrial), viral, and prionic or prion-like organisms. Preferred
regulatory sequences include, but are not limited to, promoters, operators, and activator
binding sites. Thus, this invention provides that functional regulatory sequences may be
introduced into a man-made gene of this invention. This invention also provides that
functional regulatory sequences may be introduced into a man-made gene pathway of this
invention.

Accordingly, this invention provides for the generation of a chimeric
polynucleotide that is a man-made gene containing one (or more) artificially introduced
regulatory sequence(s). Accordingly, this invention also provides for the generation of a
chimeric polynucleotide that is a man-made gene pathway containing one (or more)
artificially introduced regulatory sequence(s). Preferably, the artificially introduced
regulatory sequence(s) are operatively linked to one or more genes in the man-made
polynucleotide, and are functional in one or more host cells.

Preferred bacterial promoters that are serviceable for this invention include lacI,
lacZ, T3, T7, gpt, lambda PR, PL and trp. Serviceable eukaryotic promoters include CMV
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and
mouse metallothionein-I. Particular plant regulatory sequences include promoters active
in directing transcription in plants, either constitutively or stage and/or tissue specific,
depending on the use of the plant or parts thereof. These promoters include, but are not
limited to, promoters showing constitutive expression, such as the 35S promoter of
Cauliflower Mosaic Virus (CaMV) (Guilley et al., 1982), those for leaf-specific
expression, such as the promoter of the ribulose bisphosphate carboxylase small subunit
gene (Coruzzi et al., 1984), those for root-specific expression, such as the promoter from
the glutamine synthase gene (Tingey et al., 1987), those for seed-specific expression, such
as the cruciferin A promoter from Brassica napus (Ryan et al., 1989), those for
tuber-specific expression, such as the class-I patatin promoter from potato (Rocha-Sosa et al.,
1989; Wenzler et al., 1989) or those for fruit-specific expression, such as the
polygalacturonase (PG) promoter from tomato (Bird et al., 1988).

Other regulatory sequences that are preferred for this invention include terminator
sequences and polyadenylation signals and any such sequence functioning as such in
plants, the choice of which is within the level of the skilled artisan. An example of such
sequences is the 3' flanking region of the nopaline synthase (nos) gene of Agrobacterium
tumefaciens (Bevan, 1984). The regulatory sequences may also include enhancer
sequences, such as found in the 35S promoter of CaMV, and mRNA stabilizing
sequences such as the leader sequence of Alfalfa Mosaic Virus (AlMV) RNA4
(Brederode et al., 1980) or any other sequences functioning in a like manner.

A man-made gene produced using this invention can also serve as a substrate for
recombination with another nucleic acid. Likewise, a man-made gene pathway produced
using this invention can also serve as a substrate for recombination with another nucleic
acid. In a preferred instance, the recombination is facilitated by, or occurs at, areas of
homology between the man-made intron-containing gene and a nucleic acid which serves
as a recombination partner. In a particularly preferred instance, the recombination partner
may also be a nucleic acid generated by this invention, including a man-made gene or a
man-made gene pathway. Recombination may be facilitated by or may occur at areas of
homology that exist at the one (or more) artificially introduced intron(s) in the man-made
gene.

The synthetic ligation reassembly method of this invention utilizes a plurality of
nucleic acid building blocks, each of which preferably has two ligatable ends. The two
ligatable ends on each nucleic acid building block may be two blunt ends (i.e. each
having an overhang of zero nucleotides), or preferably one blunt end and one overhang,
or more preferably still two overhangs.

A serviceable overhang for this purpose may be a 3' overhang or a 5' overhang.
Thus, a nucleic acid building block may have a 3' overhang or alternatively a 5' overhang
or alternatively two 3' overhangs or alternatively two 5' overhangs. The overall order in
which the nucleic acid building blocks are assembled to form a finalized chimeric nucleic
acid molecule is determined by purposeful experimental design and is not random.
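
How designed overhangs can fix the assembly order is sketched below under a deliberately simplified model, which is an assumption of this illustration rather than a description of the disclosed procedure. Each building block is reduced to a name plus the top-strand sequence (5' to 3') of the junction at its left and right ends, two ends are taken to ligate when their junction sequences are identical, and None marks an outer terminus. The names Block and assembly_order are hypothetical.

```python
from typing import List, NamedTuple, Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


class Block(NamedTuple):
    name: str
    left: Optional[str]   # junction sequence at the left end; None = outer terminus
    right: Optional[str]  # junction sequence at the right end; None = outer terminus


def assembly_order(blocks: List[Block]) -> List[str]:
    """Return the single block order implied by the junctions, or raise ValueError."""
    junctions = [b.right for b in blocks if b.right is not None]
    if len(set(junctions)) != len(junctions):
        raise ValueError("a junction sequence is used at more than one right end")
    for j in junctions:
        # A palindromic junction equals its own reverse complement, so this
        # test also rejects ends that could ligate to themselves.
        if revcomp(j) in junctions:
            raise ValueError(f"junction {j} could pair in more than one way")
    by_left = {b.left: b for b in blocks}
    if len(by_left) != len(blocks):
        raise ValueError("two blocks share the same left end")
    current = by_left.get(None)
    if current is None:
        raise ValueError("no block has a free left terminus")
    order = [current.name]
    while current.right is not None:
        nxt = by_left.get(current.right)
        if nxt is None:
            raise ValueError(f"no block accepts junction {current.right}")
        current = nxt
        order.append(current.name)
    if len(order) != len(blocks):
        raise ValueError("some blocks are not part of the chain")
    return order
```

In this model, three blocks supplied in any order with junctions ACG and GTT resolve to a single chain, whereas a palindromic junction such as GATC is rejected because such an end could also ligate to a copy of itself.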

According to one preferred embodiment, a nucleic acid building block is 
generated by chemical synthesis of two single-stranded nucleic acids (also referred to as 
single-stranded oligos) and contacting them so as to allow them to anneal to form a 
double-stranded nucleic acid building block. 

A double-stranded nucleic acid building block can be of variable size. The sizes
of these building blocks can be small or large depending on the choice of the
experimenter. Preferred sizes for building blocks range from 1 base pair (not including
any overhangs) to 100,000 base pairs (not including any overhangs). Other preferred size
ranges are also provided, which have lower limits of from 1 bp to 10,000 bp (including
every integer value in between), and upper limits of from 2 bp to 100,000 bp (including
every integer value in between).

It is appreciated that current methods of polymerase-based amplification can be
used to generate double-stranded nucleic acids of up to thousands of base pairs, if not
tens of thousands of base pairs, in length with high fidelity. Chemical synthesis (e.g.
phosphoramidite-based) can be used to generate nucleic acids of up to hundreds of
nucleotides in length with high fidelity; however, these can be assembled, e.g. using
overhangs or sticky ends, to form double-stranded nucleic acids of up to thousands of
base pairs, if not tens of thousands of base pairs, in length if so desired.

A combination of methods (e.g. phosphoramidite-based chemical synthesis and
PCR) can also be used according to this invention. Thus, nucleic acid building blocks
made by different methods can also be used in combination to generate a progeny
molecule of this invention.

The use of chemical synthesis to generate nucleic acid building blocks is
particularly preferred in this invention and is advantageous for other reasons as well,
including procedural safety and ease. No cloning or harvesting or actual handling of any
biological samples is required. The design of the nucleic acid building blocks can be
accomplished on paper. Accordingly, this invention teaches an advance in procedural
safety in recombinant technologies.

Nonetheless, according to one preferred embodiment, a double-stranded nucleic
acid building block according to this invention may also be generated by
polymerase-based amplification of a polynucleotide template. In a non-limiting exemplification, as
illustrated in Figure 2, a first polymerase-based amplification reaction using a first set of
primers, F2 and R1, is used to generate a blunt-ended product (labeled Reaction 1, Product
1), which is essentially identical to Product A. A second polymerase-based amplification
reaction using a second set of primers, F1 and R2, is used to generate a blunt-ended
product (labeled Reaction 2, Product 2), which is essentially identical to Product B.
These two products are mixed and allowed to melt and anneal, generating potentially
useful double-stranded nucleic acid building blocks with two overhangs. In the example
of Fig. 2, the product with the 3' overhangs (Product C) is selected by nuclease-based
degradation of the other 3 products using a 3' acting exonuclease, such as exonuclease
III. It is appreciated that a 5' acting exonuclease (e.g. red alpha) may also be used, for
example to select Product D instead. It is also appreciated that other selection means can
also be used, including hybridization-based means, and that these means can incorporate
a further means, such as a magnetic bead-based means, to facilitate separation of the
desired product.

Many other methods exist by which a double-stranded nucleic acid building block
can be generated that is serviceable for this invention; and these are known in the art and
can be readily performed by the skilled artisan.

According to a particularly preferred embodiment, a double-stranded nucleic acid
building block that is serviceable for this invention is generated by first generating two
single stranded nucleic acids and allowing them to anneal to form a double-stranded
nucleic acid building block. The two strands of a double-stranded nucleic acid building
block may be complementary at every nucleotide apart from any that form an overhang;
thus containing no mismatches, apart from any overhang(s). According to another
embodiment, the two strands of a double-stranded nucleic acid building block are
complementary at fewer than every nucleotide apart from any that form an overhang.
Thus, according to this embodiment, a double-stranded nucleic acid building block can be
used to introduce codon degeneracy. Preferably the codon degeneracy is introduced
using the site-saturation mutagenesis described herein, using one or more N,N,G/T
cassettes or alternatively using one or more N,N,N cassettes.
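
As a small worked illustration of what the N,N,G/T and N,N,N cassettes named above encode, the following sketch enumerates the codons of a degenerate cassette and tallies the encoded amino acids. It assumes the standard genetic code and the IUPAC convention in which K stands for G or T, so N,N,G/T is written NNK; the helper names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Degenerate symbols used in this sketch: N = any base, K = G or T.
DEGENERATE = {"A": "A", "C": "C", "G": "G", "T": "T", "K": "GT", "N": "ACGT"}


def expand_cassette(cassette):
    """Expand a 3-symbol degenerate cassette (e.g. 'NNK') into concrete codons."""
    return ["".join(c) for c in product(*(DEGENERATE[s] for s in cassette))]


def amino_acid_counts(cassette):
    """Count how many codons of the cassette encode each amino acid ('*' = stop)."""
    return Counter(CODON_TABLE[c] for c in expand_cassette(cassette))


for cassette in ("NNK", "NNN"):
    counts = amino_acid_counts(cassette)
    n_amino = len(counts) - ("*" in counts)  # distinct amino acids, stop excluded
    print(cassette, sum(counts.values()), "codons,", n_amino,
          "amino acids,", counts["*"], "stop codon(s)")
```

Under these assumptions the N,N,G/T cassette spans 32 codons that still cover all 20 amino acids with a single stop codon (TAG), while N,N,N spans all 64 codons including three stop codons. The same function applied to "NAN" gives 16 codons with two codons each for asparagine, glutamine, tyrosine, aspartic acid, glutamic acid, lysine, histidine and the stop signal.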

Contained within an exemplary experimental design for achieving an ordered
assembly according to this invention are:

1) The design of specific nucleic acid building blocks.
2) The design of specific ligatable ends on each nucleic acid building block.
3) The design of a particular order of assembly of the nucleic acid building
blocks.

An overhang may be a 3' overhang or a 5' overhang. An overhang may also have
a terminal phosphate group or alternatively may be devoid of a terminal phosphate group
(having, e.g., a hydroxyl group instead). An overhang may be comprised of any number
of nucleotides. Preferably an overhang is comprised of 0 nucleotides (as in a blunt end) to
10,000 nucleotides. Thus, a wide range of overhang sizes may be serviceable.
Accordingly, the lower limit may be each integer from 1-200 and the upper limit may be
each integer from 2-10,000. According to a particular exemplification, an overhang may
consist of anywhere from 1 nucleotide to 200 nucleotides (including every integer value
in between).



The final chimeric nucleic acid molecule may be generated by sequentially
assembling 2 or more building blocks at a time until all the designated building blocks
have been assembled. A working sample may optionally be subjected to a process for
size selection or purification or other selection or enrichment process between the
performance of two assembly steps. Alternatively, the final chimeric nucleic acid
molecule may be generated by assembling all the designated building blocks at once in
one step.

Utility

The in vivo recombination method of this invention can be performed blindly on a
pool of unknown hybrids or alleles of a specific polynucleotide or sequence. However, it
is not necessary to know the actual DNA or RNA sequence of the specific
polynucleotide.

The approach of using recombination within a mixed population of genes can be
useful for the generation of any useful proteins, for example, interleukin I, antibodies,
tPA and growth hormone. This approach may be used to generate proteins having altered
specificity or activity. The approach may also be useful for the generation of hybrid
nucleic acid sequences, for example, promoter regions, introns, exons, enhancer
sequences, 3' untranslated regions or 5' untranslated regions of genes. Thus this
approach may be used to generate genes having increased rates of expression. This
approach may also be useful in the study of repetitive DNA sequences. Finally, this
approach may be useful to mutate ribozymes or aptamers.



Scaffold-like regions separating regions of diversity in proteins may be
particularly suitable for the methods of this invention. The conserved scaffold
determines the overall folding by self-association, while displaying relatively unrestricted
loops that mediate the specific binding. Examples of such scaffolds are the
immunoglobulin beta barrel, and the four-helix bundle. The methods of this invention
can be used to create scaffold-like proteins with various combinations of mutated
sequences for binding.

The equivalents of some standard genetic matings may also be performed by the
methods of this invention. For example, a "molecular" backcross can be performed by
repeated mixing of the hybrid's nucleic acid with the wild-type nucleic acid while
selecting for the mutations of interest. As in traditional breeding, this approach can be
used to combine phenotypes from different sources into a background of choice. It is
useful, for example, for the removal of neutral mutations that affect unselected
characteristics (i.e. immunogenicity). Thus it can be useful to determine which mutations
in a protein are involved in the enhanced biological activity and which are not.

2.11.2.4. END-SELECTION 

This invention provides a method for selecting a subset of polynucleotides from a
starting set of polynucleotides, which method is based on the ability to discriminate one
or more selectable features (or selection markers) present anywhere in a working
polynucleotide, so as to allow one to perform selection for (positive selection) &/or
against (negative selection) each selectable polynucleotide. In a preferred aspect, a
method is provided termed end-selection, which method is based on the use of a selection
marker located in part or entirely in a terminal region of a selectable polynucleotide, and
such a selection marker may be termed an "end-selection marker".

End-selection may be based on detection of naturally occurring sequences or on
detection of sequences introduced experimentally (including by any mutagenesis
procedure mentioned herein and not mentioned herein) or on both, even within the same
polynucleotide. An end-selection marker can be a structural selection marker or a
functional selection marker or both a structural and a functional selection marker. An
end-selection marker may be comprised of a polynucleotide sequence or of a polypeptide
sequence or of any chemical structure or of any biological or biochemical tag, including
markers that can be selected using methods based on the detection of radioactivity, of
enzymatic activity, of fluorescence, of any optical feature, of a magnetic property (e.g.
using magnetic beads), of immunoreactivity, and of hybridization.

End-selection may be applied in combination with any method serviceable for
performing mutagenesis. Such mutagenesis methods include, but are not limited to,
methods described herein (supra and infra). Such methods include, by way of
non-limiting exemplification, any method that may be referred to herein or by others in the art
by any of the following terms: "saturation mutagenesis", "shuffling", "recombination",
"re-assembly", "error-prone PCR", "assembly PCR", "sexual PCR", "crossover PCR",
"oligonucleotide primer-directed mutagenesis", "recursive (&/or exponential) ensemble
mutagenesis (see Arkin and Youvan, 1992)", "cassette mutagenesis", "in vivo
mutagenesis", and "in vitro mutagenesis". Moreover, end-selection may be performed on
molecules produced by any mutagenesis &/or amplification method (see, e.g., Arnold,
1993; Caldwell and Joyce, 1992; Stemmer, 1994), following which method it is desirable
to select for (including to screen for the presence of) desirable progeny molecules.

In addition, end-selection may be applied to a polynucleotide apart from any
mutagenesis method. In a preferred embodiment, end-selection, as provided herein, can
be used in order to facilitate a cloning step, such as a step of ligation to another
polynucleotide (including ligation to a vector). This invention thus provides for
end-selection as a serviceable means to facilitate library construction, selection &/or
enrichment for desirable polynucleotides, and cloning in general.

In a particularly preferred embodiment, end-selection can be based on (positive)
selection for a polynucleotide; alternatively end-selection can be based on (negative)
selection against a polynucleotide; and alternatively still, end-selection can be based on
both (positive) selection for, and on (negative) selection against, a polynucleotide.
End-selection, along with other methods of selection &/or screening, can be performed in an
iterative fashion, with any combination of like or unlike selection &/or screening
methods and serviceable mutagenesis methods, all of which can be performed in an
iterative fashion and in any order, combination, and permutation.

It is also appreciated that, according to one embodiment of this invention,
end-selection may also be used to select a polynucleotide that is at least in part: circular (e.g. a
plasmid or any other circular vector or any other polynucleotide that is partly circular),
&/or branched, &/or modified or substituted with any chemical group or moiety. In
accord with this embodiment, a polynucleotide may be a circular molecule comprised of
an intermediate or central region, which region is flanked on a 5' side by a 5' flanking
region (which, for the purpose of end-selection, serves in like manner to a 5' terminal
region of a non-circular polynucleotide) and on a 3' side by a 3' terminal region (which,
for the purpose of end-selection, serves in like manner to a 3' terminal region of a
non-circular polynucleotide). As used in this non-limiting exemplification, there may be
sequence overlap between any two regions or even among all three regions.

In one non-limiting aspect of this invention, end-selection of a linear
polynucleotide is performed using a general approach based on the presence of at least
one end-selection marker located at or near a polynucleotide end or terminus (that can be
either a 5' end or a 3' end). In one particular non-limiting exemplification, end-selection
is based on selection for a specific sequence at or near a terminus such as, but not limited
to, a sequence recognized by an enzyme that recognizes a polynucleotide sequence. An
enzyme that recognizes and catalyzes a chemical modification of a polynucleotide is
referred to herein as a polynucleotide-acting enzyme. In a preferred embodiment,
serviceable polynucleotide-acting enzymes are exemplified non-exclusively by enzymes
with polynucleotide-cleaving activity, enzymes with polynucleotide-methylating activity,
enzymes with polynucleotide-ligating activity, and enzymes with a plurality of
distinguishable enzymatic activities (including non-exclusively, e.g., both
polynucleotide-cleaving activity and polynucleotide-ligating activity).

Relevant polynucleotide-acting enzymes thus also include any commercially
available or non-commercially available polynucleotide endonucleases and their
companion methylases including those catalogued at the website
http://www.neb.com/rebase, and those mentioned in the following cited reference
(Roberts and Macelis, 1996). Preferred polynucleotide endonucleases include - but are
not limited to - type II restriction enzymes (including type IIS), and include enzymes that
cleave both strands of a double stranded polynucleotide (e.g. Not I, which cleaves both
strands at 5'...GC/GGCCGC...3') and enzymes that cleave only one strand of a double
stranded polynucleotide, i.e. enzymes that have polynucleotide-nicking activity (e.g. N.
BstNB I, which cleaves only one strand at 5'...GAGTCNNNN/N...3'). Relevant
polynucleotide-acting enzymes also include type III restriction enzymes.

It is appreciated that relevant polynucleotide-acting enzymes also include any
enzymes that may be developed in the future, though currently unavailable, that are
serviceable for generating a ligation compatible end, preferably a sticky end, in a
polynucleotide.

In one preferred exemplification, a serviceable selection marker is a restriction
site in a polynucleotide that allows a corresponding type II (or type IIS) restriction
enzyme to cleave an end of the polynucleotide so as to provide a ligatable end (including
a blunt end or alternatively a sticky end with at least a one base overhang) that is
serviceable for a desirable ligation reaction without cleaving the polynucleotide internally
in a manner that destroys a desired internal sequence in the polynucleotide. Thus it is
provided that, among relevant restriction sites, those sites that do not occur internally (i.e.
that do not occur apart from the termini) in a specific working polynucleotide are
preferred when the use of a corresponding restriction enzyme(s) is not intended to cut the
working polynucleotide internally. This allows one to perform restriction digestion
reactions to completion or to near completion without incurring unwanted internal
cleavage in a working polynucleotide.
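
The preference for sites that occur only at the termini can be checked computationally when the sequence of the working polynucleotide is known. The following is a minimal sketch under stated assumptions: the polynucleotide is linear and given as a top-strand string, the recognition site contains only A, C, G and T, and the "terminal region" is taken to be a fixed number of nucleotides at each end. The function names, the terminal_len parameter and the example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_positions(seq, site):
    """Start indices (top-strand coordinates) of every occurrence of `site`,
    searching both strands by also scanning for its reverse complement."""
    hits = set()
    for pattern in {site, revcomp(site)}:
        i = seq.find(pattern)
        while i != -1:
            hits.add(i)
            i = seq.find(pattern, i + 1)  # step by 1 to catch overlapping hits
    return sorted(hits)


def is_end_only(seq, site, terminal_len):
    """True if `site` occurs at least once and every occurrence lies wholly
    within the first or last `terminal_len` nucleotides of `seq`."""
    hits = site_positions(seq, site)
    last_start = len(seq) - terminal_len
    return bool(hits) and all(
        h + len(site) <= terminal_len or h >= last_start for h in hits
    )
```

For example, with the Not I recognition sequence GCGGCCGC placed near both ends of a short test sequence and nowhere else, is_end_only returns True for a 12-nucleotide terminal region; adding a third, internal copy makes it return False, flagging the site as unsuitable for digestion to completion.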

According to a preferred aspect, it is thus preferable to use restriction sites that are
not contained, or alternatively that are not expected to be contained, or alternatively that
are unlikely to be contained (e.g. when sequence information regarding a working
polynucleotide is incomplete) internally in a polynucleotide to be subjected to
end-selection. In accordance with this aspect, it is appreciated that restriction sites that occur
relatively infrequently are usually preferred over those that occur more frequently. On
the other hand it is also appreciated that there are occasions where internal cleavage of a
polynucleotide is desired, e.g. to achieve recombination or other mutagenic procedures along
with end-selection.

In accord with this invention, it is also appreciated that methods (e.g. mutagenesis
methods) can be used to remove unwanted internal restriction sites. It is also appreciated
that a partial digestion reaction (i.e. a digestion reaction that proceeds to partial
completion) can be used to achieve digestion at a recognition site in a terminal region
while sparing a susceptible restriction site that occurs internally in a polynucleotide and
that is recognized by the same enzyme. In one aspect, partial digests are useful because it
is appreciated that certain enzymes show preferential cleavage of the same recognition
sequence depending on the location and environment in which the recognition sequence
occurs. For example, it is appreciated that, while lambda DNA has 5 EcoR I sites,
cleavage of the site nearest to the right terminus has been reported to occur 10 times
faster than the sites in the middle of the molecule. Also, for example, it has been reported
that, while Sac II has four sites on lambda DNA, the three clustered centrally in lambda
are cleaved 50 times faster than the remaining site near the terminus (at nucleotide
40,386). Summarily, site preferences have been reported for various enzymes by many
investigators (e.g., Thomas and Davis, 1975; Forsblum et al, 1976; Nath and Azzolina,
1981; Brown and Smith, 1977; Gingeras and Brooks, 1983; Kruger et al, 1988;
Conrad and Topal, 1989; Oller et al, 1991; Topal, 1991; and Pein, 1991; to name but a
few). It is appreciated that any empirical observations as well as any mechanistic
understandings of site preferences by any serviceable polynucleotide-acting enzymes,
whether currently available or to be procured in the future, may be serviceable in
end-selection according to this invention.

It is also appreciated that protection methods can be used to selectively protect
specified restriction sites (e.g. internal sites) against unwanted digestion by enzymes that
would otherwise cut a working polynucleotide in response to the presence of those sites; and
that such protection methods include modifications such as methylations and base
substitutions (e.g. U instead of T) that inhibit an unwanted enzyme activity. It is
appreciated that there are limited numbers of available restriction enzymes that are rare
enough (e.g. having very long recognition sequences) to create large (e.g. megabase-long)
restriction fragments, and that protection approaches (e.g. by methylation) are serviceable
for increasing the rarity of enzyme cleavage sites. The use of M.Fnu II (mCGCG) to
increase the apparent rarity of Not I approximately twofold is but one example among
many (Qiang et al, 1990; Nelson et al, 1984; Maxam and Gilbert, 1980; Raleigh and
Wilson, 1986).

According to a preferred aspect of this invention, it is provided that, in general,
the use of rare restriction sites is preferred. It is appreciated that, in general, the
frequency of occurrence of a restriction site is determined by the number of nucleotides
contained therein, as well as by the ambiguity of the base requirements contained therein.
Thus, in a non-limiting exemplification, it is appreciated that, in general, a restriction site
composed of, for example, 8 specific nucleotides (e.g. the Not I site or GC/GGCCGC,
with an estimated relative occurrence of 1 in 4^8, i.e. 1 in 65,536, random 8-mers) is
relatively more infrequent than one composed of, for example, 6 nucleotides (e.g. the
Sma I site or CCC/GGG, having an estimated relative occurrence of 1 in 4^6, i.e. 1 in
4,096, random 6-mers), which in turn is relatively more infrequent than one composed of,
for example, 4 nucleotides (e.g. the Msp I site or C/CGG, having an estimated relative
occurrence of 1 in 4^4, i.e. 1 in 256, random 4-mers). Moreover, in another non-limiting
exemplification, it is appreciated that, in general, a restriction site having no ambiguous
(but only specific) base requirements (e.g. the Fin I site or GTCCC, having an estimated
relative occurrence of 1 in 4^5, i.e. 1 in 1024, random 5-mers) is relatively more infrequent
than one having an ambiguous W (where W = A or T) base requirement (e.g. the Ava II
site or G/GWCC, having an estimated relative occurrence of 1 in 4x4x2x4x4, i.e. 1 in
512, random 5-mers), which in turn is relatively more infrequent than one having an
ambiguous N (where N = A or C or G or T) base requirement (e.g. the Asu I site or
G/GNCC, having an estimated relative occurrence of 1 in 4x4x1x4x4, i.e. 1 in 256,
random 5-mers). These relative occurrences are considered general estimates for actual
polynucleotides, because it is appreciated that specific nucleotide bases (not to mention
specific nucleotide sequences) occur with dissimilar frequencies in specific
polynucleotides, in specific species of organisms, and in specific groupings of organisms.
For example, it is appreciated that the % G+C contents of different species of organisms
are often very different and wide ranging.
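
The estimates above can be reproduced with a short calculation. The following is a
minimal sketch, assuming equal and independent base frequencies (which, as noted, real
polynucleotides do not strictly obey); the function name expected_occurrence and the
degeneracy table are illustrative and not part of the disclosure.

```python
from fractions import Fraction

# Number of concrete bases matched by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def expected_occurrence(site: str) -> Fraction:
    """Probability that a random k-mer matches `site`.

    Assumes each of the four bases occurs with probability 1/4.
    A '/' marking the cleavage position is ignored.
    """
    prob = Fraction(1)
    for code in site.replace("/", "").upper():
        prob *= Fraction(IUPAC_DEGENERACY[code], 4)
    return prob


if __name__ == "__main__":
    sites = [
        ("Not I", "GC/GGCCGC"),
        ("Sma I", "CCC/GGG"),
        ("Msp I", "C/CGG"),
        ("Fin I", "GTCCC"),
        ("Ava II", "G/GWCC"),
        ("Asu I", "G/GNCC"),
    ]
    for name, site in sites:
        print(f"{name:7s} {site:10s} 1 in {1 / expected_occurrence(site)}")
```

Replacing the uniform 1/4 with measured base frequencies for a given organism would be
one way to refine these figures for a polynucleotide of known G+C content.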

The use of relatively more infrequent restriction sites as a selection marker
includes - in a non-limiting fashion - preferably those sites composed of at least a 4
nucleotide sequence, more preferably those composed of at least a 5 nucleotide sequence,
more preferably still those composed of at least a 6 nucleotide sequence (e.g. the BamH I
site or G/GATCC, the Bgl II site or A/GATCT, the Pst I site or CTGCA/G, and the Xba I
site or T/CTAGA), more preferably still those composed of at least a 7 nucleotide sequence,
more preferably still those composed of an 8 nucleotide sequence
(e.g. the Asc I site or GG/CGCGCC, the Not I site or GC/GGCCGC, the Pac I site or
TTAAT/TAA, the Pme I site or GTTT/AAAC, the Srf I site or GCCC/GGGC, the Sse838
I site or CCTGCA/GG, and the Swa I site or ATTT/AAAT), more preferably still those
composed of a 9 nucleotide sequence, and even more preferably still those composed of
at least a 10 nucleotide sequence (e.g. the BspG I site or CG/CGCTGGAC). It is further
appreciated that some restriction sites (e.g. for class IIS enzymes) are comprised of a
portion of relatively high specificity (i.e. a portion containing a principal determinant of
the frequency of occurrence of the restriction site) and a portion of relatively low
specificity; and that a site of cleavage may or may not be contained within a portion of
relatively low specificity. For example, in the Eco57 I site or CTGAAG(16/14), there is a
portion of relatively high specificity (i.e. the CTGAAG portion) and a portion of
relatively low specificity (i.e. the N16 sequence) that contains a site of cleavage.

In another preferred embodiment of this invention, a serviceable end-selection
marker is a terminal sequence that is recognized by a polynucleotide-acting enzyme that
recognizes a specific polynucleotide sequence. In a preferred aspect of this invention,
serviceable polynucleotide-acting enzymes also include other enzymes in addition to
classic type II restriction enzymes. According to this preferred aspect of this invention,
serviceable polynucleotide-acting enzymes also include gyrases, helicases, recombinases,
relaxases, and any enzymes related thereto.

Among preferred examples are topoisomerases (which have been categorized by
some as a subset of the gyrases) and any other enzymes that have polynucleotide-cleaving
activity (including preferably polynucleotide-nicking activity) &/or
polynucleotide-ligating activity. Among preferred topoisomerase enzymes are
topoisomerase I enzymes, which are available from many commercial sources (Epicentre
Technologies, Madison, WI; Invitrogen, Carlsbad, CA; Life Technologies, Gaithersburg,
MD) and conceivably even more private sources. It is appreciated that similar enzymes
may be developed in the future that are serviceable for end-selection as provided herein.
A particularly preferred topoisomerase I enzyme is a topoisomerase I enzyme of vaccinia
virus origin, that has a specific recognition sequence (e.g. 5'...AAGGG...3') and has
both polynucleotide-nicking activity and polynucleotide-ligating activity. Due to the
specific nicking activity of this enzyme (cleavage of one strand), internal recognition
sites are not prone to polynucleotide destruction resulting from the nicking activity (but
rather remain annealed) at a temperature that causes denaturation of a terminal site that
has been nicked. Thus for use in end-selection, it is preferable that a nicking site for
topoisomerase-based end-selection be no more than 100 nucleotides from a terminus,
more preferably no more than 50 nucleotides from a terminus, more preferably still no
more than 25 nucleotides from a terminus, even more preferably still no more than 20
nucleotides from a terminus, even more preferably still no more than 15 nucleotides from
a terminus, even more preferably still no more than 10 nucleotides from a terminus, even
more preferably still no more than 8 nucleotides from a terminus, even more preferably
still no more than 6 nucleotides from a terminus, and even more preferably still no more
than 4 nucleotides from a terminus.

In a particularly preferred exemplification that is non-limiting yet clearly
illustrative, it is appreciated that when a nicking site for topoisomerase-based
end-selection is 4 nucleotides from a terminus, nicking produces a single stranded oligo of 4
bases (in a terminal region) that can be denatured from its complementary strand in an
end-selectable polynucleotide; this provides a sticky end (comprised of 4 bases) in a
polynucleotide that is serviceable for an ensuing ligation reaction. To accomplish
ligation to a cloning vector (preferably an expression vector), compatible sticky ends can
be generated in a cloning vector by any means including by restriction enzyme-based
means. The terminal nucleotides (comprised of 4 terminal bases in this specific example)
in an end-selectable polynucleotide terminus are thus wisely chosen to provide
compatibility with a sticky end generated in a cloning vector to which the polynucleotide
is to be ligated.

On the other hand, internal nicking of an end-selectable polynucleotide, e.g. 500
bases from a terminus, produces a single stranded oligo of 500 bases that is not easily
denatured from its complementary strand, but rather is serviceable for repair (e.g. by the
same topoisomerase enzyme that produced the nick).
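
The distinction between terminal and internal sites can be illustrated with a short
sketch. It assumes, purely for illustration, that the distance of a site from a terminus
is counted as the number of bases lying between the AAGGG sequence and the nearer end of
the strand, and that a cutoff of 25 nucleotides separates "terminal" from "internal"; the
function names and the cutoff parameter are hypothetical, and only one strand is scanned.

```python
TOPO_SITE = "AAGGG"  # recognition sequence as given in the text


def site_positions(seq: str, site: str = TOPO_SITE) -> list:
    """Return 0-based start indices of every occurrence of `site` in `seq`."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]


def classify_sites(seq: str, max_terminal_distance: int = 25) -> list:
    """Label each site 'terminal' or 'internal'.

    Distance is the number of bases between the site and the nearer end
    of the strand (an assumption made for this sketch).
    """
    results = []
    for start in site_positions(seq):
        bases_before = start
        bases_after = len(seq) - (start + len(TOPO_SITE))
        distance = min(bases_before, bases_after)
        label = "terminal" if distance <= max_terminal_distance else "internal"
        results.append((start, distance, label))
    return results


if __name__ == "__main__":
    # One site 4 bases from the 5' end and one site 500 bases from the 3' end.
    example = "CTAG" + "AAGGG" + "T" * 600 + "AAGGG" + "C" * 500
    for start, distance, label in classify_sites(example):
        print(start, distance, label)
```

In this example the first site would yield a 4-base oligo that is readily denatured,
whereas the second corresponds to the 500-base case that remains annealed.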

This invention thus provides a method - e.g. that is vaccinia topoisomerase-based
&/or type II (or IIS) restriction endonuclease-based &/or type III restriction
endonuclease-based &/or nicking enzyme-based (e.g. using N. BstNB I) - for producing
a sticky end in a working polynucleotide, which end is ligation compatible, and which
end can be comprised of at least a 1-base overhang. Preferably such a sticky end is
comprised of at least a 2-base overhang, more preferably such a sticky end is comprised
of at least a 3-base overhang, more preferably still such a sticky end is comprised of at
least a 4-base overhang, even more preferably still such a sticky end is comprised of at
least a 5-base overhang, even more preferably still such a sticky end is comprised of at
least a 6-base overhang. Such a sticky end may also be comprised of at least a 7-base
overhang, or at least an 8-base overhang, or at least a 9-base overhang, or at least a
10-base overhang, or at least a 15-base overhang, or at least a 20-base overhang, or at least a
25-base overhang, or at least a 30-base overhang. These overhangs can be comprised of
any bases, including A, C, G, or T.

It is appreciated that sticky end overhangs introduced using topoisomerase or a
nicking enzyme (e.g. using N. BstNB I) can be designed to be unique in a ligation
environment, so as to prevent unwanted fragment reassemblies, such as
self-dimerizations and other unwanted concatamerizations.
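
One way to check a proposed set of overhangs for uniqueness is sketched below. It
assumes that all overhangs are written 5' to 3' and have the same polarity, so that two
overhangs can anneal when one is the reverse complement of the other; the overhang names
and sequences in the example are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_self_complementary(overhang: str) -> bool:
    """True if the overhang can anneal to another copy of itself."""
    return overhang.upper() == reverse_complement(overhang)


def unwanted_pairings(overhangs: dict, intended: set) -> list:
    """Return pairs of named overhangs that can anneal but are not intended.

    `overhangs` maps a name to a 5'->3' overhang sequence.
    `intended` is a set of frozensets naming the pairings that are wanted.
    """
    names = sorted(overhangs)
    clashes = []
    for i, first in enumerate(names):
        for second in names[i:]:
            can_anneal = overhangs[first].upper() == reverse_complement(overhangs[second])
            if can_anneal and frozenset((first, second)) not in intended:
                clashes.append((first, second))
    return clashes


if __name__ == "__main__":
    ends = {"frag_a": "ACGG", "vec_a": "CCGT", "frag_b": "GATC"}
    wanted = {frozenset(("frag_a", "vec_a"))}
    print(unwanted_pairings(ends, wanted))
```

In this example frag_b is flagged because its overhang is palindromic and could
therefore self-dimerize, while the intended frag_a/vec_a pairing is accepted.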


According to one aspect of this invention, a plurality of sequences (which may but
do not necessarily overlap) can be introduced into a terminal region of an end-selectable
polynucleotide by the use of an oligo in a polymerase-based reaction. In a relevant, but
by no means limiting example, such an oligo can be used to provide a preferred 5'
terminal region that is serviceable for topoisomerase I-based end-selection, which oligo is
comprised of: a 1-10 base sequence that is convertible into a sticky end (preferably by a
vaccinia topoisomerase I), a ribosome binding site (i.e. an "RBS", that is preferably
serviceable for expression cloning), and an optional linker sequence followed by an ATG
start site and a template-specific sequence of 0-100 bases (to facilitate annealment to the
template in a polymerase-based reaction). Thus, according to this example, a
serviceable oligo (which may be termed a forward primer) can have the sequence:
5'[terminal sequence = (N)1-10][topoisomerase I site & RBS = AAGGGAGGAG][linker
= (N)1-100][start codon and template-specific sequence = ATG(N)0-100]3'.

Analogously, in a relevant, but by no means limiting example, an oligo can be
used to provide a preferred 3' terminal region that is serviceable for topoisomerase
I-based end-selection, which oligo is comprised of: a 1-10 base sequence that is convertible
into a sticky end (preferably by a vaccinia topoisomerase I), and an optional linker sequence
followed by a template-specific sequence of 0-100 bases (to facilitate annealment to the
template in a polymerase-based reaction). Thus, according to this example, a
serviceable oligo (which may be termed a reverse primer) can have the sequence:
5'[terminal sequence = (N)1-10][topoisomerase I site = AAGGG][linker =
(N)1-100][template-specific sequence = (N)0-100]3'.
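
The two primer layouts can be assembled programmatically, as in the following minimal
sketch. The segment lengths follow the ranges stated above; the function names and the
validation helper are illustrative assumptions rather than part of the disclosed method.

```python
TOPO_SITE = "AAGGG"           # topoisomerase I site, from the text
TOPO_SITE_RBS = "AAGGGAGGAG"  # topoisomerase I site & RBS, from the text
START_CODON = "ATG"


def _segment(name: str, seq: str, min_len: int, max_len: int) -> str:
    """Validate one primer segment: A/C/G/T only, length in [min_len, max_len]."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError(f"{name} must contain only A, C, G or T")
    if not min_len <= len(seq) <= max_len:
        raise ValueError(f"{name} must be {min_len}-{max_len} bases, got {len(seq)}")
    return seq


def forward_primer(terminal: str, linker: str, template_specific: str = "") -> str:
    """5'[terminal][topo I site & RBS][linker][ATG + template-specific]3'."""
    return (
        _segment("terminal", terminal, 1, 10)
        + TOPO_SITE_RBS
        + _segment("linker", linker, 1, 100)
        + START_CODON
        + _segment("template_specific", template_specific, 0, 100)
    )


def reverse_primer(terminal: str, linker: str, template_specific: str = "") -> str:
    """5'[terminal][topo I site][linker][template-specific]3'."""
    return (
        _segment("terminal", terminal, 1, 10)
        + TOPO_SITE
        + _segment("linker", linker, 1, 100)
        + _segment("template_specific", template_specific, 0, 100)
    )


if __name__ == "__main__":
    print(forward_primer("CTAG", "AAAACC"))
```

With a CTAG terminal sequence, an AAAACC linker, and no template-specific bases, the
forward layout gives CTAGAAGGGAGGAGAAAACCATG.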

It is appreciated that end-selection can be used to distinguish and separate parental
template molecules (e.g. to be subjected to mutagenesis) from progeny molecules (e.g.
generated by mutagenesis). For example, a first set of primers, lacking in a topoisomerase I
recognition site, can be used to modify the terminal regions of the parental molecules (e.g. in
polymerase-based amplification). A different second set of primers (e.g. having a
topoisomerase I recognition site) can then be used to generate mutated progeny molecules
(e.g. using any polynucleotide chimerization method, such as interrupted synthesis or
template-switching polymerase-based amplification; or using saturation
mutagenesis; or using any other method for introducing a topoisomerase I recognition site into
a mutagenized progeny molecule as disclosed herein) from the amplified template molecules.
The use of topoisomerase I-based end-selection can then facilitate, not only discernment, but
selective topoisomerase I-based ligation of the desired progeny molecules.

Annealment of a second set of primers to thusly amplified parental molecules can be
facilitated by including sequences in a first set of primers (i.e. primers used for amplifying a
set of parental molecules) that are similar to a topoisomerase I recognition site, yet different
enough to prevent functional topoisomerase I enzyme recognition. For example, sequences
that diverge from the AAGGG site by anywhere from 1 base to all 5 bases can be incorporated
into a first set of primers (to be used for amplifying the parental templates prior to subjection
to mutagenesis). In a specific, but non-limiting aspect, it is thus provided that a parental
molecule can be amplified using the following exemplary - but by no means limiting - set of
forward and reverse primers:

Forward Primer: 5' CTAGAAGAGAGGAGAAAACCATG(N)10-100 3', and

Reverse Primer: 5' GATCAAAGGCGCGCCTGCAGG(N)10-100 3'

According to this specific example of a first set of primers, (N)10-100 represents
preferably a 10 to 100 nucleotide-long template-specific sequence, more preferably a 10 to 50
nucleotide-long template-specific sequence, more preferably still a 10 to 30 nucleotide-long
template-specific sequence, and even more preferably still a 15 to 25 nucleotide-long
template-specific sequence.

According to a specific, but non-limiting aspect, it is thus provided that, after this
amplification (using a disclosed first set of primers lacking in a true topoisomerase I
recognition site), amplified parental molecules can then be subjected to mutagenesis using one
or more sets of forward and reverse primers that do have a true topoisomerase I recognition
site. In a specific, but non-limiting aspect, it is thus provided that a parental molecule can be
used as a template for the generation of a mutagenized progeny molecule using the following
exemplary - but by no means limiting - second set of forward and reverse primers:

Forward Primer: 5' CTAGAAGGGAGGAGAAAACCATG 3'
Reverse Primer: 5' GATCAAAGGCGCGCCTGCAGG 3' (contains Asc I
recognition sequence)
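
The relationship between the first and second forward primers can be checked with a
short sketch. The primer sequences are those printed above (without the template-specific
(N)10-100 tails); the helper name mismatches and the constant names are illustrative.

```python
TOPO_SITE = "AAGGG"      # topoisomerase I recognition sequence, from the text
ASC_I_SITE = "GGCGCGCC"  # Asc I recognition sequence (GG/CGCGCC)

FIRST_SET_FORWARD = "CTAGAAGAGAGGAGAAAACCATG"
SECOND_SET_FORWARD = "CTAGAAGGGAGGAGAAAACCATG"
REVERSE_PRIMER = "GATCAAAGGCGCGCCTGCAGG"


def mismatches(a: str, b: str) -> list:
    """Return 0-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    print("first forward has AAGGG: ", TOPO_SITE in FIRST_SET_FORWARD)
    print("second forward has AAGGG:", TOPO_SITE in SECOND_SET_FORWARD)
    print("differing positions:     ", mismatches(FIRST_SET_FORWARD, SECOND_SET_FORWARD))
    print("Asc I site index:        ", REVERSE_PRIMER.find(ASC_I_SITE))
```

The two forward primers differ at a single position, which converts the near-site AAGAG
of the first set into the AAGGG site of the second set.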

It is appreciated that any number of different primer sets not specifically mentioned
can be used as first, second, or subsequent sets of primers for end-selection consistent with
this invention. Notice that type II restriction enzyme sites can be incorporated (e.g. an Asc I
site in the above example). It is provided that, in addition to the other sequences mentioned,
the experimentalist can incorporate one or more N,N,G/T triplets into a serviceable primer in
order to subject a working polynucleotide to saturation mutagenesis. Summarily, use of a
second and/or subsequent set of primers can achieve dual goals of introducing a
topoisomerase I site and of generating mutations in a progeny polynucleotide.
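
The coding capacity of an N,N,G/T triplet can be enumerated directly. The sketch below
assumes the standard genetic code and writes G/T with the IUPAC symbol K; the names
CODON_TABLE and expand are illustrative.

```python
from itertools import product

# Standard genetic code, first base varying slowest, in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate symbols used here: N = any base, K = G or T.
DEGENERATE = {"N": "ACGT", "K": "GT"}


def expand(triplet: str) -> list:
    """Return every concrete codon encoded by a degenerate triplet such as 'NNK'."""
    choices = (DEGENERATE.get(base, base) for base in triplet.upper())
    return ["".join(codon) for codon in product(*choices)]


if __name__ == "__main__":
    codons = expand("NNK")
    amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    print(len(codons), "codons")
    print(len(amino_acids), "amino acids:", "".join(amino_acids))
    print("stop codons:", stops)
```

The 32 codons of an N,N,G/T triplet cover all 20 amino acids while retaining only one
of the three stop codons (TAG).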

Thus, according to one use provided, a serviceable end-selection marker is an
enzyme recognition site that allows an enzyme to cleave (including nick) a
polynucleotide at a specified site, to produce a ligation-compatible end upon denaturation
of a generated single stranded oligo. Ligation of the produced polynucleotide end can
then be accomplished by the same enzyme (e.g. in the case of vaccinia virus
topoisomerase I), or alternatively with the use of a different enzyme. According to one
aspect of this invention, any serviceable end-selection markers, whether like (e.g. two
vaccinia virus topoisomerase I recognition sites) or unlike (e.g. a class II restriction
enzyme recognition site and a vaccinia virus topoisomerase I recognition site) can be
used in combination to select a polynucleotide. Each selectable polynucleotide can thus
have one or more end-selection markers, and they can be like or unlike end-selection
markers. In a particular aspect, a plurality of end-selection markers can be located on one
end of a polynucleotide and can have overlapping sequences with each other.

It is important to emphasize that any number of enzymes, whether currently in
existence or to be developed, can be serviceable in end-selection according to this
invention. For example, in a particular aspect of this invention, a nicking enzyme (e.g. N.
BstNB I, which cleaves only one strand at 5'...GAGTCNNNN/N...3') can be used in
conjunction with a source of polynucleotide-ligating activity in order to achieve
end-selection. According to this embodiment, a recognition site for N. BstNB I - instead of a
recognition site for topoisomerase I - should be incorporated into an end-selectable
polynucleotide (whether end-selection is used for selection of a mutagenized progeny
molecule or whether end-selection is used apart from any mutagenesis procedure).
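
The nick position implied by 5'...GAGTCNNNN/N...3' can be located as in the following
sketch. It scans only the strand supplied (occurrences on the complementary strand are not
considered), and the function name nick_positions is illustrative.

```python
import re

RECOGNITION = "GAGTC"  # N. BstNB I recognition sequence, from the text
NICK_OFFSET = 4        # bases between the recognition sequence and the nick (NNNN/N)


def nick_positions(strand: str) -> list:
    """Return indices i such that the nick falls between strand[i-1] and strand[i].

    Only the given strand (written 5'->3') is scanned. A site is reported only
    if at least one base lies 3' of the nick, as in GAGTCNNNN/N.
    """
    seq = strand.upper()
    cuts = []
    # The lookahead lets re.finditer report overlapping occurrences as well.
    for match in re.finditer(f"(?={RECOGNITION})", seq):
        cut = match.start() + len(RECOGNITION) + NICK_OFFSET
        if cut < len(seq):
            cuts.append(cut)
    return cuts


if __name__ == "__main__":
    strand = "TTGAGTCACGTAAAA"
    for cut in nick_positions(strand):
        print(strand[:cut] + "/" + strand[cut:])
```

For the example strand the nick falls after TTGAGTCACGT, four bases 3' of the GAGTC
recognition sequence.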

It is appreciated that the instantly disclosed end-selection approach using
topoisomerase-based nicking and ligation has several advantages over previously available selection
methods. In sum, this approach allows one to achieve directional cloning (including
expression cloning). Specifically, this approach can be used for the achievement of:
direct ligation (i.e. without subjection to a classic restriction-purification-ligation
reaction, that is susceptible to a multitude of potential problems from an initial restriction
reaction to a ligation reaction dependent on the use of T4 DNA ligase); separation of
progeny molecules from original template molecules (e.g. original template molecules
lack topoisomerase I sites, which are not introduced until after mutagenesis); obviation of the
need for size separation steps (e.g. by gel chromatography or by other electrophoretic
means or by the use of size-exclusion membranes); preservation of internal sequences
(even when topoisomerase I sites are present); obviation of concerns about unsuccessful
ligation reactions (e.g. dependent on the use of T4 DNA ligase, particularly in the
presence of unwanted residual restriction enzyme activity); and facilitated expression
cloning (including obviation of frame shift concerns). Concerns about unwanted
restriction enzyme-based cleavages - especially at internal restriction sites (or even at
often unpredictable sites of unwanted star activity) in a working polynucleotide - that are
potential sites of destruction of a working polynucleotide can also be obviated by the
instantly disclosed end-selection approach using topoisomerase-based nicking and
ligation.

2.11 J. ADDITIONAL SCREENING METHODS 

Peptide Display Methods

The present method can be used to shuffle, by in vitro and/or in vivo
recombination by any of the disclosed methods, and in any combination, polynucleotide
sequences selected by peptide display methods, wherein an associated polynucleotide
encodes a displayed peptide which is screened for a phenotype (e.g., for affinity for a
predetermined receptor (ligand)).

An increasingly important aspect of bio-pharmaceutical drug development and
molecular biology is the identification of peptide structures, including the primary amino
acid sequences, of peptides or peptidomimetics that interact with biological
macromolecules. One method of identifying peptides that possess a desired structure or
functional property, such as binding to a predetermined biological macromolecule (e.g., a
receptor), involves the screening of a large library of peptides for individual library
members which possess the desired structure or functional property conferred by the
amino acid sequence of the peptide.

In addition to direct chemical synthesis methods for generating peptide libraries,
several recombinant DNA methods also have been reported. One type involves the
display of a peptide sequence, antibody, or other protein on the surface of a bacteriophage
particle or cell. Generally, in these methods each bacteriophage particle or cell serves as
an individual library member displaying a single species of displayed peptide in addition
to the natural bacteriophage or cell protein sequences. Each bacteriophage or cell
contains the nucleotide sequence information encoding the particular displayed peptide
sequence; thus, the displayed peptide sequence can be ascertained by nucleotide sequence
determination of an isolated library member.

A well-known peptide display method involves the presentation of a peptide
sequence on the surface of a filamentous bacteriophage, typically as a fusion with a
bacteriophage coat protein. The bacteriophage library can be incubated with an
immobilized, predetermined macromolecule or small molecule (e.g., a receptor) so that
bacteriophage particles which present a peptide sequence that binds to the immobilized
macromolecule can be differentially partitioned from those that do not present peptide
sequences that bind to the predetermined macromolecule. The bacteriophage particles
(i.e., library members) which are bound to the immobilized macromolecule are then
recovered and replicated to amplify the selected bacteriophage sub-population for a
subsequent round of affinity enrichment and phage replication. After several rounds of
affinity enrichment and phage replication, the bacteriophage library members that are
thus selected are isolated and the nucleotide sequence encoding the displayed peptide
sequence is determined, thereby identifying the sequence(s) of peptides that bind to the
predetermined macromolecule (e.g., receptor). Such methods are further described in
PCT patent publications WO 91/17271, WO 91/18980, WO 91/19818 and WO
93/08278.

The latter PCT publication describes a recombinant DNA method for the display 
of peptide ligands that involves the production of a library of fusion proteins with each 
fusion protein composed of a first polypeptide portion, typically comprising a variable 
sequence, that is available for potential binding to a predetermined macromolecule, and a 
second polypeptide portion that binds to DNA, such as the DNA vector encoding the 
individual fusion protein. When transformed host cells are cultured under conditions that 
allow for expression of the fusion protein, the fusion protein binds to the DNA vector 
encoding it. Upon lysis of the host cell, the fusion protein/vector DNA complexes can be 
screened against a predetermined macromolecule in much the same way as bacteriophage 
particles are screened in the phage-based display system, with the replication and
sequencing of the DNA vectors in the selected fusion protein/vector DNA complexes 
serving as the basis for identification of the selected library peptide sequence(s). 

Other systems for generating libraries of peptides and like polymers have aspects
of both the recombinant and in vitro chemical synthesis methods. In these hybrid
methods, cell-free enzymatic machinery is employed to accomplish the in vitro synthesis
of the library members (i.e., peptides or polynucleotides). In one type of method, RNA
molecules with the ability to bind a predetermined protein or a predetermined dye
molecule were selected by alternate rounds of selection and PCR amplification (Tuerk
and Gold, 1990; Ellington and Szostak, 1990). A similar technique was used to
identify DNA sequences which bind a predetermined human transcription factor
(Thiesen and Bach, 1990; Beaudry and Joyce, 1992; PCT patent publications WO
92/05258 and WO 92/14843). In a similar fashion, the technique of in vitro translation
has been used to synthesize proteins of interest and has been proposed as a method for
generating large libraries of peptides. These methods which rely upon in vitro
translation, generally comprising stabilized polysome complexes, are described further in
PCT patent publications WO 88/08453, WO 90/05785, WO 90/07003, WO 91/02076,
WO 91/05058, and WO 92/02536. Applicants have described methods in which library
members comprise a fusion protein having a first polypeptide portion with DNA binding
activity and a second polypeptide portion having the library member unique peptide
sequence; such methods are suitable for use in cell-free in vitro selection formats, among
others.

The displayed peptide sequences can be of varying lengths, typically from 3-5000
amino acids long or longer, frequently from 5-100 amino acids long, and often from
about 8-15 amino acids long. A library can comprise library members having varying
lengths of displayed peptide sequence, or may comprise library members having a fixed
length of displayed peptide sequence. Portions or all of the displayed peptide sequence(s)
can be random, pseudorandom, defined set kernel, fixed, or the like. The present display
methods include methods for in vitro and in vivo display of single-chain antibodies, such
as nascent scFv on polysomes or scFv displayed on phage, which enable large-scale
screening of scFv libraries having broad diversity of variable region sequences and
binding specificities.

The present invention also provides random, pseudorandom, and defined
sequence framework peptide libraries and methods for generating and screening those
libraries to identify useful compounds (e.g., peptides, including single-chain antibodies)
that bind to receptor molecules or epitopes of interest or gene products that modify
peptides or RNA in a desired fashion. The random, pseudorandom, and defined sequence
framework peptides are produced from libraries of peptide library members that comprise
displayed peptides or displayed single-chain antibodies attached to a polynucleotide
template from which the displayed peptide was synthesized. The mode of attachment
may vary according to the specific embodiment of the invention selected, and can include
encapsulation in a phage particle or incorporation in a cell.

A method of affinity enrichment allows a very large library of peptides and
single-chain antibodies to be screened and the polynucleotide sequence encoding the
desired peptide(s) or single-chain antibodies to be selected. The polynucleotide can then
be isolated and shuffled to recombine combinatorially the amino acid sequence of the
selected peptide(s) (or predetermined portions thereof) or single-chain antibodies (or just
VH, VL or CDR portions thereof). Using these methods, one can identify a peptide or
single-chain antibody as having a desired binding affinity for a molecule and can exploit
the process of shuffling to converge rapidly to a desired high-affinity peptide or scFv. The
peptide or antibody can then be synthesized in bulk by conventional means for any
suitable use (e.g., as a therapeutic or diagnostic agent).

A significant advantage of the present invention is that no prior information
regarding an expected ligand structure is required to isolate peptide ligands or antibodies
of interest. The peptide identified can have biological activity, which is meant to include
at least specific binding affinity for a selected receptor molecule and, in some instances,
will further include the ability to block the binding of other compounds, to stimulate or
inhibit metabolic pathways, to act as a signal or messenger, to stimulate or inhibit cellular
activity, and the like.

The present invention also provides a method for shuffling a pool of
polynucleotide sequences selected by affinity screening a library of polysomes displaying
nascent peptides (including single-chain antibodies) for library members which bind to a
predetermined receptor (e.g., a mammalian proteinaceous receptor such as, for example, a
peptidergic hormone receptor, a cell surface receptor, an intracellular protein which binds
to other protein(s) to form intracellular protein complexes such as hetero-dimers and the
like) or epitope (e.g., an immobilized protein, glycoprotein, oligosaccharide, and the
like).

Polynucleotide sequences selected in a first selection round (typically by affinity
selection for binding to a receptor (e.g., a ligand)) by any of these methods are pooled
and the pool(s) is/are shuffled by in vitro and/or in vivo recombination to produce a
shuffled pool comprising a population of recombined selected polynucleotide sequences.
The recombined selected polynucleotide sequences are subjected to at least one
subsequent selection round. The polynucleotide sequences selected in the subsequent
selection round(s) can be used directly, sequenced, and/or subjected to one or more
additional rounds of shuffling and subsequent selection. Selected sequences can also be
back-crossed with polynucleotide sequences encoding neutral sequences (i.e., having
insubstantial functional effect on binding), such as for example by back-crossing with a
wild-type or naturally-occurring sequence substantially identical to a selected sequence to
produce native-like functional peptides, which may be less immunogenic. Generally,
during back-crossing subsequent selection is applied to retain the property of binding to
the predetermined receptor (ligand).
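
The alternation of selection and shuffling rounds can be pictured with a toy simulation.
Everything in the sketch below is a hypothetical stand-in: binding affinity is modeled as
the number of positions matching an arbitrary TARGET sequence, shuffling is modeled as
single-point crossover between two selected sequences, and the selected parents are
assumed to remain in the pool alongside the recombinants. It is not a model of any
particular display or recombination chemistry.

```python
import random

TARGET = "ATGGCTAAGGGT"  # hypothetical ideal sequence; stands in for an affinity assay


def affinity(seq: str) -> int:
    """Toy score: number of positions at which `seq` matches TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET))


def select(pool: list, keep: int) -> list:
    """Affinity selection: keep the `keep` best-scoring pool members."""
    return sorted(pool, key=affinity, reverse=True)[:keep]


def shuffle_pool(selected: list, n_offspring: int, rng: random.Random) -> list:
    """Recombine random pairs of selected sequences at a random crossover point."""
    offspring = []
    for _ in range(n_offspring):
        first, second = rng.sample(selected, 2)
        cut = rng.randrange(1, len(first))
        offspring.append(first[:cut] + second[cut:])
    return offspring


def evolve(library: list, rounds: int, keep: int = 10,
           n_offspring: int = 40, seed: int = 0) -> list:
    """Alternate selection and shuffling; return the final selected pool."""
    rng = random.Random(seed)
    pool = list(library)
    for _ in range(rounds):
        selected = select(pool, keep)
        # Assumption: unrecombined parents persist next to the recombinants.
        pool = selected + shuffle_pool(selected, n_offspring, rng)
    return select(pool, keep)


if __name__ == "__main__":
    rng = random.Random(1)
    library = ["".join(rng.choice("ACGT") for _ in TARGET) for _ in range(50)]
    print("best before:", max(map(affinity, library)))
    print("best after: ", affinity(evolve(library, rounds=5)[0]))
```

Because the selected parents are carried forward, the best score in this toy model cannot
decrease from round to round; a back-crossing step could be sketched in the same way by
adding a wild-type sequence to the selected pool before shuffling.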

Prior to or concomitant with the shuffling of selected sequences, the sequences
can be mutagenized. In one embodiment, selected library members are cloned in a
prokaryotic vector (e.g., plasmid, phagemid, or bacteriophage) wherein a collection of
individual colonies (or plaques) representing discrete library members are produced.
Individual selected library members can then be manipulated (e.g., by site-directed
mutagenesis, cassette mutagenesis, chemical mutagenesis, PCR mutagenesis, and the
like) to generate a collection of library members representing a kernel of sequence
diversity based on the sequence of the selected library member. The sequence of an
individual selected library member or pool can be manipulated to incorporate random
mutation, pseudorandom mutation, defined kernel mutation (i.e., comprising variant and
invariant residue positions and/or comprising variant residue positions which can
comprise a residue selected from a defined subset of amino acid residues), codon-based
mutation, and the like, either segmentally or over the entire length of the individual
selected library member sequence. The mutagenized selected library members are then
shuffled by in vitro and/or in vivo recombinatorial shuffling as disclosed herein.

The invention also provides peptide libraries comprising a plurality of individual
library members of the invention, wherein (1) each individual library member of said
plurality comprises a sequence produced by shuffling of a pool of selected sequences, and
(2) each individual library member comprises a variable peptide segment sequence or
single-chain antibody segment sequence which is distinct from the variable peptide
segment sequences or single-chain antibody sequences of other individual library
members in said plurality (although some library members may be present in more than
one copy per library due to uneven amplification, stochastic probability, or the like).

The invention also provides a product-by-process, wherein selected 
polynucleotide sequences having (or encoding a peptide having) a predetermined binding 
specificity are formed by the process of: (1) screening a displayed peptide or displayed 
single-chain antibody library against a predetermined receptor (e.g., ligand) or epitope 
(e.g., antigen macromolecule) and identifying and/or enriching library members which 
bind to the predetermined receptor or epitope to produce a pool of selected library 
members, (2) shuffling by recombination the selected library members (or amplified or 
cloned copies thereof) which bind the predetermined epitope and have thereby been 
isolated and/or enriched from the library to generate a shuffled library, and (3) screening 
the shuffled library against the predetermined receptor (e.g., ligand) or epitope (e.g., 
antigen macromolecule) and identifying and/or enriching shuffled library members which 
bind to the predetermined receptor or epitope to produce a pool of selected shuffled 
library members. 

Antibody Display and Screening Methods 

The present method can be used to shuffle, by in vitro and/or in vivo 
recombination by any of the disclosed methods, and in any combination, polynucleotide 
sequences selected by antibody display methods, wherein an associated polynucleotide 
encodes a displayed antibody which is screened for a phenotype (e.g., for affinity for 
binding a predetermined antigen (ligand)). 

Various molecular genetic approaches have been devised to capture the vast 
immunological repertoire represented by the extremely large number of distinct variable 
regions which can be present in immunoglobulin chains. The naturally-occurring germ 
line immunoglobulin heavy chain locus is composed of separate tandem arrays of 
variable segment genes located upstream of a tandem array of diversity segment genes, 
which are themselves located upstream of a tandem array of joining (J) region genes, 
which are located upstream of the constant region genes. During B lymphocyte 
development, V-D-J rearrangement occurs wherein a heavy chain variable region gene 
(VH) is formed by rearrangement to form a fused D-J segment followed by rearrangement 
with a V segment to form a V-D-J joined product gene which, if productively rearranged, 
encodes a functional variable region (VH) of a heavy chain. Similarly, light chain loci 
rearrange one of several V segments with one of several J segments to form a gene 
encoding the variable region (VL) of a light chain. 

The vast repertoire of variable regions possible in immunoglobulins derives in 
part from the numerous combinatorial possibilities of joining V and J segments (and, in 
the case of heavy chain loci, D segments) during rearrangement in B cell development. 
Additional sequence diversity in the heavy chain variable regions arises from 
non-uniform rearrangements of the D segments during V-D-J joining and from N region 
addition. Further, antigen-selection of specific B cell clones selects for higher affinity 
variants having non-germline mutations in one or both of the heavy and light chain 
variable regions, a phenomenon referred to as "affinity maturation" or "affinity 
sharpening". Typically, these "affinity sharpening" mutations cluster in specific areas of 
the variable region, most commonly in the complementarity-determining regions (CDRs). 

In order to overcome many of the limitations in producing and identifying 
high-affinity immunoglobulins through antigen-stimulated B cell development (i.e., 
immunization), various prokaryotic expression systems have been developed that can be 
manipulated to produce combinatorial antibody libraries which may be screened for 
high-affinity antibodies to specific antigens. Recent advances in the expression of 
antibodies in Escherichia coli and bacteriophage systems (see "alternative peptide display 
methods", infra) have raised the possibility that virtually any specificity can be obtained 
by either cloning antibody genes from characterized hybridomas or by de novo selection 
using antibody gene libraries (e.g., from Ig cDNA). 

Combinatorial libraries of antibodies have been generated in bacteriophage 
lambda expression systems which may be screened as bacteriophage plaques or as 
colonies of lysogens (Huse et al, 1989; Caton and Koprowski, 1990; Mullinax et al, 
1990; Persson et al, 1991). Various embodiments of bacteriophage antibody display 
libraries and lambda phage expression libraries have been described (Kang et al, 1991; 
Clackson et al, 1991; McCafferty et al, 1990; Burton et al, 1991; Hoogenboom et al, 
1991; Chang et al, 1991; Breitling et al, 1991; Marks et al, 1991, p. 581; Barbas et al, 
1992; Hawkins and Winter, 1992; Marks et al, 1992, p. 779; Marks et al, 1992, p. 
16007; Lowman et al, 1991; and Lerner et al, 1992; all incorporated herein by 
reference). Typically, a bacteriophage antibody display library is screened with a receptor 
(e.g., polypeptide, carbohydrate, glycoprotein, nucleic acid) that is immobilized (e.g., by 
covalent linkage to a chromatography resin to enrich for reactive phage by affinity 
chromatography) and/or labeled (e.g., to screen plaque or colony lifts). 

One particularly advantageous approach has been the use of so-called single-chain 
fragment variable (scFv) libraries (Marks et al, 1992, p. 779; Winter and Milstein, 1991; 
Clackson et al, 1991; Marks et al, 1991, p. 581; Chaudhary et al, 1990; Chiswell et al, 
1992; McCafferty et al, 1990; and Huston et al, 1988). Various embodiments of scFv 
libraries displayed on bacteriophage coat proteins have been described. 



Beginning in 1988, single-chain analogues of Fv fragments and their fusion 
proteins have been reliably generated by antibody engineering methods. The first step 
generally involves obtaining the genes encoding VH and VL domains with desired 
binding properties; these V genes may be isolated from a specific hybridoma cell line, 
selected from a combinatorial V-gene library, or made by V gene synthesis. The 
single-chain Fv is formed by connecting the component V genes with an oligonucleotide 
that encodes an appropriately designed linker peptide, such as (Gly-Gly-Gly-Gly-Ser)3 or 
equivalent linker peptide(s). The linker bridges the C-terminus of the first V region and 
N-terminus of the second, ordered as either VH-linker-VL or VL-linker-VH. In principle, 
the scFv binding site can faithfully replicate both the affinity and specificity of its parent 
antibody combining site. 
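
The domain ordering described above can be sketched at the amino acid level as follows. 
This is only an illustration: the function name build_scfv is hypothetical, and the two 
short strings passed to it are placeholder fragments standing in for complete VH and VL 
domain sequences, which are not given in this passage. 

```python
LINKER_UNIT = "GGGGS"  # one Gly-Gly-Gly-Gly-Ser repeat in one-letter code


def build_scfv(vh, vl, repeats=3, order="VH-VL"):
    """Join two V-domain sequences through a (Gly4Ser)n linker.

    The linker connects the C-terminus of the first domain to the
    N-terminus of the second, in either of the two possible orders.
    """
    linker = LINKER_UNIT * repeats
    if order == "VH-VL":
        return vh + linker + vl
    if order == "VL-VH":
        return vl + linker + vh
    raise ValueError("order must be 'VH-VL' or 'VL-VH'")


# Placeholder fragments only; real V domains are roughly 110-120 residues.
scfv = build_scfv("EVQLV", "DIQMT")
print(scfv)
```

In practice the same joining is done at the DNA level with an oligonucleotide encoding 
the linker, as the text states; the sketch only shows the resulting protein topology. 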

Thus, scFv fragments are comprised of VH and VL domains linked into a single 
polypeptide chain by a flexible linker peptide. After the scFv genes are assembled, they 
are cloned into a phagemid and expressed at the tip of the M13 phage (or similar 
filamentous bacteriophage) as fusion proteins with the bacteriophage pIII (gene 3) coat 
protein. Enriching for phage expressing an antibody of interest is accomplished by 
panning the recombinant phage displaying a population of scFv for binding to a 
predetermined epitope (e.g., target antigen, receptor). 



The linked polynucleotide of a library member provides the basis for replication 
of the library member after a screening or selection procedure, and also provides the basis 
for the determination, by nucleotide sequencing, of the identity of the displayed peptide 
sequence or VH and VL amino acid sequence. The displayed peptide(s) or single-chain 
antibody (e.g., scFv) and/or its VH and VL domains or their CDRs can be cloned and 
expressed in a suitable expression system. Often polynucleotides encoding the isolated 
VH and VL domains will be ligated to polynucleotides encoding constant regions (CH 
and CL) to form polynucleotides encoding complete antibodies (e.g., chimeric or 
fully-human), antibody fragments, and the like. Often polynucleotides encoding the 
isolated CDRs will be grafted into polynucleotides encoding a suitable variable region 
framework (and optionally constant regions) to form polynucleotides encoding complete 
antibodies (e.g., humanized or fully-human), antibody fragments, and the like. 
Antibodies can be used to isolate preparative quantities of the antigen by immunoaffinity 
chromatography. Various other uses of such antibodies are to diagnose and/or stage 
disease (e.g., neoplasia) and for therapeutic application to treat disease, such as for 
example: neoplasia, autoimmune disease, AIDS, cardiovascular disease, infections, and 
the like. 

Various methods have been reported for increasing the combinatorial diversity of 
a scFv library to broaden the repertoire of binding species (idiotype spectrum). The use of 
PCR has permitted the variable regions to be rapidly cloned either from a specific 
hybridoma source or as a gene library from non-immunized cells, affording combinatorial 
diversity in the assortment of VH and VL cassettes which can be combined. 
Furthermore, the VH and VL cassettes can themselves be diversified, such as by random, 
pseudorandom, or directed mutagenesis. Typically, VH and VL cassettes are diversified 
in or near the complementarity-determining regions (CDRs), often the third CDR, CDR3. 
Enzymatic inverse PCR mutagenesis has been shown to be a simple and reliable method 
for constructing relatively large libraries of scFv site-directed hybrids (Stemmer et al, 
1993), as has error-prone PCR and chemical mutagenesis (Deng et al, 1994). Riechmann 
(Riechmann et al, 1993) showed semi-rational design of an antibody scFv fragment using 
site-directed randomization by degenerate oligonucleotide PCR and subsequent phage 
display of the resultant scFv hybrids. Barbas (Barbas et al, 1992) attempted to circumvent 
the problem of limited repertoire sizes resulting from using biased variable region 
sequences by randomizing the sequence in a synthetic CDR region of a human tetanus 
toxoid-binding Fab. 

CDR randomization has the potential to create approximately 1 x 10^20 CDRs for 
the heavy chain CDR3 alone, and a roughly similar number of variants of the heavy chain 
CDR1 and CDR2, and light chain CDR1-3 variants. Taken individually or together, the 
combination possibilities of CDR randomization of heavy and/or light chains require 
generating a prohibitive number of bacteriophage clones to produce a clone library 
representing all possible combinations, the vast majority of which will be non-binding. 
Generation of such large numbers of primary transformants is not feasible with current 
transformation technology and bacteriophage display systems. For example, Barbas 
(Barbas et al, 1992) only generated 5 x 10^7 transformants, which represents only a tiny 
fraction of the potential diversity of a library of thoroughly randomized CDRs. 
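
The scale of the problem can be made concrete with a short calculation. The sketch 
below assumes that every randomized position may hold any of the 20 amino acids and 
that each transformant carries a distinct sequence (a best case); the function names, 
the CDR lengths, and the library size of 10^8 are illustrative assumptions, not figures 
taken from the cited work. 

```python
def cdr_sequence_space(n_randomized, alphabet_size=20):
    """Number of distinct peptides when n positions are fully randomized."""
    return alphabet_size ** n_randomized


def fraction_sampled(n_transformants, n_randomized):
    """Upper bound on the fraction of sequence space a library can hold."""
    return min(1.0, n_transformants / cdr_sequence_space(n_randomized))


# Hypothetical library of 10**8 primary transformants.
for length in (5, 10, 15):
    space = cdr_sequence_space(length)
    print(length, space, fraction_sampled(10**8, length))
```

Coverage is complete only for very short randomized stretches and collapses rapidly as 
more positions are randomized, which is the limitation the paragraph describes. 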

Despite these substantial limitations, bacteriophage display of scFv has already 
yielded a variety of useful antibodies and antibody fusion proteins. A bispecific single 
chain antibody has been shown to mediate efficient tumor cell lysis (Gruber et al, 1994). 
Intracellular expression of an anti-Rev scFv has been shown to inhibit HIV-1 virus 
replication in vitro (Duan et al, 1994), and intracellular expression of an anti-p21ras scFv 
has been shown to inhibit meiotic maturation of Xenopus oocytes (Biocca et al, 1993). 
Recombinant scFv which can be used to diagnose HIV infection have also been reported, 
demonstrating the diagnostic utility of scFv (Lilley et al, 1994). Fusion proteins wherein 
an scFv is linked to a second polypeptide, such as a toxin or fibrinolytic activator protein, 
have also been reported (Holvoet et al, 1992; Nicholls et al, 1993). 



If it were possible to generate scFv libraries having broader antibody diversity and 
overcoming many of the limitations of conventional CDR mutagenesis and 
randomization methods which can cover only a very tiny fraction of the potential 
sequence combinations, the number and quality of scFv antibodies suitable for therapeutic 
and diagnostic use could be vastly improved. To address this, the in vitro and in vivo 
shuffling methods of the invention are used to recombine CDRs which have been 
obtained (typically via PCR amplification or cloning) from nucleic acids obtained from 
selected displayed antibodies. Such displayed antibodies can be displayed on cells, on 
bacteriophage particles, on polysomes, or any suitable antibody display system wherein 
the antibody is associated with its encoding nucleic acid(s). In a variation, the CDRs are 
initially obtained from mRNA (or cDNA) from antibody-producing cells (e.g., plasma 
cells/splenocytes from an immunized wild-type mouse, a human, or a transgenic mouse 
capable of making a human antibody as in WO 92/03918, WO 93/12227, and WO 
94/25585), including hybridomas derived therefrom. 

Polynucleotide sequences selected in a first selection round (typically by affinity 
selection for displayed antibody binding to an antigen (e.g., a ligand)) by any of these 
methods are pooled and the pool(s) is/are shuffled by in vitro and/or in vivo 
recombination, especially shuffling of CDRs (typically shuffling heavy chain CDRs with 
other heavy chain CDRs and light chain CDRs with other light chain CDRs) to produce a 
shuffled pool comprising a population of recombined selected polynucleotide sequences. 
The recombined selected polynucleotide sequences are expressed in a selection format as 
a displayed antibody and subjected to at least one subsequent selection round. The 
polynucleotide sequences selected in the subsequent selection round(s) can be used 
directly, sequenced, and/or subjected to one or more additional rounds of shuffling and 
subsequent selection until an antibody of the desired binding affinity is obtained. 
Selected sequences can also be back-crossed with polynucleotide sequences encoding 
neutral antibody framework sequences (i.e., having insubstantial functional effect on 
antigen binding), such as for example by back-crossing with a human variable region 
framework to produce human-like sequence antibodies. Generally, during back-crossing 
subsequent selection is applied to retain the property of binding to the predetermined 
antigen. 
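
The recombination step of this cycle can be illustrated with a toy model in which each 
selected clone is reduced to its six CDRs. The sketch assumes, beyond what the text 
states, that exchange is position-preserving (a CDR1 is only replaced by another CDR1, 
and so on); the function name shuffle_cdrs and the CDR labels are hypothetical 
placeholders rather than real sequences. 

```python
import random


def shuffle_cdrs(clones, n_offspring, rng):
    """Recombine CDRs among selected clones, chain by chain.

    Each clone is a dict with keys "H" and "L", each holding a tuple of
    three CDRs. Heavy chain CDRs mix only with heavy chain CDRs and light
    chain CDRs only with light chain CDRs.
    """
    offspring = []
    for _ in range(n_offspring):
        child = {}
        for chain in ("H", "L"):
            # CDR i of the child comes from CDR i of a randomly chosen parent.
            child[chain] = tuple(rng.choice(clones)[chain][i] for i in range(3))
        offspring.append(child)
    return offspring


# Placeholder labels standing in for CDR sequences of three selected clones.
selected = [
    {"H": ("H1a", "H2a", "H3a"), "L": ("L1a", "L2a", "L3a")},
    {"H": ("H1b", "H2b", "H3b"), "L": ("L1b", "L2b", "L3b")},
    {"H": ("H1c", "H2c", "H3c"), "L": ("L1c", "L2c", "L3c")},
]
shuffled = shuffle_cdrs(selected, n_offspring=10, rng=random.Random(0))
for clone in shuffled:
    print(clone)
```

With three parents and six CDR positions, up to 3^6 = 729 distinct combinations are 
possible, so even a small selected pool yields a much larger shuffled pool for the next 
selection round. 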

Alternatively, or in combination with the noted variations, the valency of the 
target epitope may be varied to control the average binding affinity of selected scFv 
library members. The target epitope can be bound to a surface or substrate at varying 
densities, such as by including a competitor epitope, by dilution, or by other method 
known to those in the art. A high density (valency) of predetermined epitope can be used 
to enrich for scFv library members which have relatively low affinity, whereas a low 
density (valency) can preferentially enrich for higher affinity scFv library members. 

For generating diverse variable segments, a collection of synthetic 
oligonucleotides encoding random, pseudorandom, or a defined sequence kernel set of 
peptide sequences can be inserted by ligation into a predetermined site (e.g., a CDR). 
Similarly, the sequence diversity of one or more CDRs of the single-chain antibody 
cassette(s) can be expanded by mutating the CDR(s) with site-directed mutagenesis, 
CDR-replacement, and the like. The resultant DNA molecules can be propagated in a 
host for cloning and amplification prior to shuffling, or can be used directly (i.e., may 
avoid loss of diversity which may occur upon propagation in a host cell) and the selected 
library members subsequently shuffled. 

Displayed peptide/polynucleotide complexes (library members) which encode a 
variable segment peptide sequence of interest or a single-chain antibody of interest are 
selected from the library by an affinity enrichment technique. This is accomplished by 
means of an immobilized macromolecule or epitope specific for the peptide sequence of 
interest, such as a receptor, other macromolecule, or other epitope species. Repeating the 
affinity selection procedure provides an enrichment of library members encoding the 
desired sequences, which may then be isolated for pooling and shuffling, for sequencing, 
and/or for further propagation and affinity enrichment. 

The library members without the desired specificity are removed by washing. 
The degree and stringency of washing required will be determined for each peptide 
sequence or single-chain antibody of interest and the immobilized predetermined 
macromolecule or epitope. A certain degree of control can be exerted over the binding 
characteristics of the nascent peptide/DNA complexes recovered by adjusting the 
conditions of the binding incubation and the subsequent washing. The temperature, pH, 
ionic strength, divalent cation concentration, and the volume and duration of the 
washing will select for nascent peptide/DNA complexes within particular ranges of 
affinity for the immobilized macromolecule. Selection based on slow dissociation rate, 
which is usually predictive of high affinity, is often the most practical route. This may be 
done either by continued incubation in the presence of a saturating amount of free 
predetermined macromolecule, or by increasing the volume, number, and length of the 
washes. In each case, the rebinding of dissociated nascent peptide/DNA or peptide/RNA 
complex is prevented, and with increasing time, nascent peptide/DNA or peptide/RNA 
complexes of higher and higher affinity are recovered. 
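
Why longer washes favor slowly dissociating binders can be seen from a simple kinetic 
sketch. It assumes first-order dissociation with no rebinding (the condition the text 
says the wash or competitor is meant to enforce); the function names and the rate 
constants are illustrative assumptions only. 

```python
import math


def fraction_bound(k_off, t):
    """Fraction of complexes still bound after time t (first-order decay)."""
    return math.exp(-k_off * t)


def enrichment(k_off_slow, k_off_fast, t):
    """Ratio of retained slow dissociators to retained fast dissociators."""
    return fraction_bound(k_off_slow, t) / fraction_bound(k_off_fast, t)


# Hypothetical off-rates in 1/s: a slow (tight) and a fast (weak) binder.
for wash_seconds in (60.0, 600.0, 3600.0):
    print(wash_seconds, enrichment(1e-4, 1e-2, wash_seconds))
```

The enrichment factor grows exponentially with wash time, at the cost of also losing 
some of the desired binders, so wash duration is a trade-off between purity and yield. 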

Additional modifications of the binding and washing procedures may be applied 
to find peptides with special characteristics. The affinities of some peptides are 
dependent on ionic strength or cation concentration. This is a useful characteristic for 
peptides that will be used in affinity purification of various proteins when gentle 
conditions for removing the protein from the peptides are required. 

One variation involves the use of multiple binding targets (multiple epitope 
species, multiple receptor species), such that a scFv library can be simultaneously 
screened for a multiplicity of scFv which have different binding specificities. Given that 
the size of a scFv library often limits the diversity of potential scFv sequences, it is 
typically desirable to use scFv libraries of as large a size as possible. The time and 
economic considerations of generating a number of very large polysome scFv-display 
libraries can become prohibitive. To avoid this substantial problem, multiple 
predetermined epitope species (receptor species) can be concomitantly screened in a 
single library, or sequential screening against a number of epitope species can be used. In 
one variation, multiple target epitope species, each encoded on a separate bead (or subset 
of beads), can be mixed and incubated with a polysome-display scFv library under 
suitable binding conditions. The collection of beads, comprising multiple epitope 
species, can then be used to isolate, by affinity selection, scFv library members. 
Generally, subsequent affinity screening rounds can include the same mixture of beads, 
subsets thereof, or beads containing only one or two individual epitope species. This 
approach affords efficient screening, and is compatible with laboratory automation, batch 
processing, and high throughput screening methods. 

A variety of techniques can be used in the present invention to diversify a peptide 
library or single-chain antibody library, or to diversify, prior to or concomitant with 
shuffling, around variable segment peptides found in early rounds of panning to have 
sufficient binding activity to the predetermined macromolecule or epitope. In one 
approach, the positive selected peptide/polynucleotide complexes (those identified in an 
early round of affinity enrichment) are sequenced to determine the identity of the active 
peptides. Oligonucleotides are then synthesized based on these active peptide sequences, 
employing a low level of all bases incorporated at each step to produce slight variations 
of the primary oligonucleotide sequences. This mixture of (slightly) degenerate 
oligonucleotides is then cloned into the variable segment sequences at the appropriate 
locations. This method produces systematic, controlled variations of the starting peptide 
sequences, which can then be shuffled. It requires, however, that individual positive 
nascent peptide/polynucleotide complexes be sequenced before mutagenesis, and thus is 
useful for expanding the diversity of small numbers of recovered complexes and selecting 
variants having higher binding affinity and/or higher binding specificity. In a variation, 
mutagenic PCR amplification of positive selected peptide/polynucleotide complexes 
(especially of the variable region sequences) is performed, the amplification products of 
which are shuffled in vitro and/or in vivo, and one or more additional rounds of screening 
is done prior to sequencing. The same general approach can be employed with single-chain 
antibodies in order to expand the diversity and enhance the binding affinity/specificity, 
typically by diversifying CDRs or adjacent framework regions prior to or concomitant 
with shuffling. If desired, shuffling reactions can be spiked with mutagenic 
oligonucleotides capable of in vitro recombination with the selected library members. 
Thus, mixtures of synthetic oligonucleotides and PCR-produced 
polynucleotides (synthesized by error-prone or high-fidelity methods) can be added to the 
in vitro shuffling mix and be incorporated into resulting shuffled library members 
(shufflants). 
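
The "low level of all bases incorporated at each step" approach to oligonucleotide 
synthesis can be simulated as follows. The sketch assumes that, at each position, the 
parental base is kept with probability 1 - spike_rate and is otherwise replaced by one 
of the other three bases chosen uniformly; the function names, the parental sequence, 
and the 5 percent spike rate are hypothetical. 

```python
import random

BASES = "ACGT"


def spiked_copy(parent, spike_rate, rng):
    """Return one oligonucleotide made from `parent` with spiked synthesis."""
    out = []
    for base in parent:
        if rng.random() < spike_rate:
            # A contaminating base was incorporated at this step.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def spiked_pool(parent, spike_rate, size, seed=0):
    """Return `size` independent spiked copies of `parent`."""
    rng = random.Random(seed)
    return [spiked_copy(parent, spike_rate, rng) for _ in range(size)]


parent = "GCTAGGTACGAT"  # hypothetical sequence encoding an active peptide
pool = spiked_pool(parent, spike_rate=0.05, size=1000)
unchanged = sum(1 for oligo in pool if oligo == parent)
print(unchanged, "of", len(pool), "copies are identical to the parent")
```

Under these assumptions the expected number of substitutions per copy is the oligo 
length times the spike rate, so the spike rate can be tuned to give mostly single or 
double mutants clustered around the parental sequence. 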

The present invention of shuffling enables the generation of a vast library of 
CDR-variant single-chain antibodies. One way to generate such antibodies is to insert 
synthetic CDRs into the single-chain antibody and/or to randomize CDRs prior to or 
concomitant with shuffling. The sequences of the synthetic CDR cassettes are selected 
by referring to known sequence data of human CDRs and are selected at the discretion of 
the practitioner according to the following guidelines: synthetic CDRs will have at least 
40 percent positional sequence identity to known CDR sequences, and preferably will 
have at least 50 to 70 percent positional sequence identity to known CDR sequences. For 
example, a collection of synthetic CDR sequences can be generated by synthesizing a 
collection of oligonucleotide sequences on the basis of naturally-occurring human CDR 
sequences listed in Kabat (Kabat et al, 1991); the pool(s) of synthetic CDR sequences 
are calculated to encode CDR peptide sequences having at least 40 percent sequence 
identity to at least one known naturally-occurring human CDR sequence. Alternatively, a 
collection of naturally-occurring CDR sequences may be compared to generate consensus 
sequences so that amino acids used at a residue position frequently (i.e., in at least 5 
percent of known CDR sequences) are incorporated into the synthetic CDRs at the 
corresponding position(s). Typically, several (e.g., 3 to about 50) known CDR sequences 
are compared and observed natural sequence variations between the known CDRs are 
tabulated, and a collection of oligonucleotides encoding CDR peptide sequences 
encompassing all or most permutations of the observed natural sequence variations is 
synthesized. For example but not for limitation, if a collection of human VH CDR 
sequences have carboxy-terminal amino acids which are either Tyr, Val, Phe, or Asp, then 
the pool(s) of synthetic CDR oligonucleotide sequences are designed to allow the 
carboxy-terminal CDR residue to be any of these amino acids. In some embodiments, 
residues other than those which naturally occur at a residue position in the collection of 
CDR sequences are incorporated: conservative amino acid substitutions are frequently 
incorporated and up to 5 residue positions may be varied to incorporate non-conservative 
amino acid substitutions as compared to known naturally-occurring CDR sequences. 
Such CDR sequences can be used in primary library members (prior to first round 
screening) and/or can be used to spike in vitro shuffling reactions of selected library 
member sequences. Construction of such pools of defined and/or degenerate sequences 
will be readily accomplished by those of ordinary skill in the art. 
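
The tabulate-and-permute guideline above can be sketched as follows. The example 
assumes the known CDRs are already aligned and of equal length; the function names and 
the four short "known" sequences are hypothetical (they merely reproduce the Tyr, Val, 
Phe, or Asp carboxy-terminal example from the text), and the 5 percent threshold is the 
frequency cutoff mentioned above. 

```python
from collections import Counter
from itertools import product


def residues_by_position(cdrs, min_fraction=0.05):
    """List, per position, the residues seen in at least min_fraction of CDRs."""
    length = len(cdrs[0])
    if any(len(c) != length for c in cdrs):
        raise ValueError("CDRs must be aligned to the same length")
    allowed = []
    for i in range(length):
        counts = Counter(c[i] for c in cdrs)
        allowed.append(sorted(aa for aa, n in counts.items()
                              if n / len(cdrs) >= min_fraction))
    return allowed


def enumerate_synthetic_cdrs(cdrs, min_fraction=0.05):
    """All permutations of the tabulated per-position variation."""
    return ["".join(p) for p in product(*residues_by_position(cdrs, min_fraction))]


def positional_identity(a, b):
    """Percent of aligned positions at which two equal-length CDRs agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical aligned CDRs whose last residue is Tyr, Val, Phe, or Asp.
known = ["ARGY", "ARGV", "ASGF", "ARGD"]
synthetic = enumerate_synthetic_cdrs(known)
print(synthetic)
print(positional_identity("ARGY", "ASGY"))
```

Some members of the enumerated pool (here, for instance, "ASGY") do not occur in the 
input set, and each can be checked against the 40 percent positional identity guideline 
with positional_identity before oligonucleotides are ordered. 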

The collection of synthetic CDR sequences comprises at least one member that is 
not known to be a naturally-occurring CDR sequence. It is within the discretion of the 
practitioner to include or not include a portion of random or pseudorandom sequence 
corresponding to N region addition in the heavy chain CDR; the N region sequence 
ranges from 1 nucleotide to about 4 nucleotides occurring at V-D and D-J junctions. A 
collection of synthetic heavy chain CDR sequences comprises at least about 100 unique 
CDR sequences, typically at least about 1,000 unique CDR sequences, preferably at least 
about 10,000 unique CDR sequences, frequently more than 50,000 unique CDR 
sequences; however, usually not more than about 1 x 10^6 unique CDR sequences are 
included in the collection, although occasionally 1 x 10^7 to 1 x 10^8 unique CDR 
sequences are present, especially if conservative amino acid substitutions are permitted at 
positions where the conservative amino acid substituent is not present or is rare (i.e., less 
than 0.1 percent) in that position in naturally-occurring human CDRs. In general, the 
number of unique CDR sequences included in a library should not exceed the expected 
number of primary transformants in the library by more than a factor of 10. Such 
single-chain antibodies generally bind with an affinity of about at least 1 x 10^7 M-1, 
preferably with an affinity of about at least 5 x 10^7 M-1, more preferably with an 
affinity of at least 1 x 10^8 M-1 to 1 x 10^9 M-1 or more, sometimes up to 1 x 10^10 M-1 
or more. Frequently, the 
predetermined antigen is a human protein, such as for example a human cell surface 
antigen (e.g., CD4, CD8, IL-2 receptor, EGF receptor, PDGF receptor), other human 
biological macromolecule (e.g., thrombomodulin, protein C, carbohydrate antigen, sialyl 
Lewis antigen, L-selectin), or nonhuman disease associated macromolecule (e.g., bacterial 
LPS, virion capsid protein or envelope glycoprotein) and the like. 

High affinity single-chain antibodies of the desired specificity can be engineered 
and expressed in a variety of systems. For example, scFv have been produced in plants 
(Firek et al, 1993) and can be readily made in prokaryotic systems (Owens and Young, 
1994; Johnson and Bird, 1991). Furthermore, the single-chain antibodies can be used as 
a basis for constructing whole antibodies or various fragments thereof (Kettleborough et 
al, 1994). The variable region encoding sequence may be isolated (e.g., by PCR 
amplification or subcloning) and spliced to a sequence encoding a desired human 
constant region to encode a human sequence antibody more suitable for human 
therapeutic uses where immunogenicity is preferably minimized. The polynucleotide(s) 
having the resultant fully human encoding sequence(s) can be expressed in a host cell 
(e.g., from an expression vector in a mammalian cell) and purified for pharmaceutical 
formulation. 

The DNA expression constructs will typically include an expression control DNA 
sequence operably linked to the coding sequences, including naturally-associated or 
heterologous promoter regions. Preferably, the expression control sequences will be 
eukaryotic promoter systems in vectors capable of transforming or transfecting 
eukaryotic host cells. Once the vector has been incorporated into the appropriate host, 
the host is maintained under conditions suitable for high level expression of the 
nucleotide sequences, and the collection and purification of the mutant "engineered" 
antibodies. 

As stated previously, the DNA sequences will be expressed in hosts after the 
sequences have been operably linked to an expression control sequence (i.e., positioned 
to ensure the transcription and translation of the structural gene). These expression 
vectors are typically replicable in the host organisms either as episomes or as an integral 
part of the host chromosomal DNA. Commonly, expression vectors will contain 
selection markers, e.g., tetracycline or neomycin, to permit detection of those cells 
transformed with the desired DNA sequences (see, e.g., USPN 4,704,362, which is 
incorporated herein by reference). 

In addition to eukaryotic microorganisms such as yeast, mammalian tissue cell 
culture may also be used to produce the polypeptides of the present invention (see 
Winnacker, 1987, which is incorporated herein by reference). Eukaryotic cells are 
actually preferred, because a number of suitable host cell lines capable of secreting intact 
immunoglobulins have been developed in the art, and include the CHO cell lines, various 
COS cell lines, HeLa cells, and myeloma cell lines, but preferably transformed B-cells or 
hybridomas. Expression vectors for these cells can include expression control sequences, 
such as an origin of replication, a promoter, an enhancer (Queen et al, 1986), and 
necessary processing information sites, such as ribosome binding sites, RNA splice sites, 
polyadenylation sites, and transcriptional terminator sequences. Preferred expression 
control sequences are promoters derived from immunoglobulin genes, cytomegalovirus, 
SV40, Adenovirus, Bovine Papilloma Virus, and the like. 

Eukaryotic DNA transcription can be increased by inserting an enhancer 
sequence into the vector. Enhancers are cis-acting sequences of between 10 and 300 bp 
that increase transcription by a promoter. Enhancers can effectively increase 
transcription when either 5' or 3' to the transcription unit. They are also effective if 
located within an intron or within the coding sequence itself. Typically, viral enhancers 
are used, including SV40 enhancers, cytomegalovirus enhancers, polyoma enhancers, and 
adenovirus enhancers. Enhancer sequences from mammalian systems are also commonly 
used, such as the mouse immunoglobulin heavy chain enhancer. 

Mammalian expression vector systems will also typically include a selectable 
marker gene. Examples of suitable markers include the dihydrofolate reductase gene 
(DHFR), the thymidine kinase gene (TK), or prokaryotic genes conferring drug 
resistance. The first two marker genes prefer the use of mutant cell lines that lack the 
ability to grow without the addition of thymidine to the growth medium. Transformed 
cells can then be identified by their ability to grow on non-supplemented media. 
Examples of prokaryotic drug resistance genes useful as markers include genes 
conferring resistance to G418, mycophenolic acid and hygromycin. 

The vectors containing the DNA segments of interest can be transferred into the 
host cell by well-known methods, depending on the type of cellular host. For example, 
calcium chloride transfection is commonly utilized for prokaryotic cells, whereas calcium 
phosphate treatment, lipofection, or electroporation may be used for other cellular hosts. 
Other methods used to transform mammalian cells include the use of Polybrene, 
protoplast fusion, liposomes, electroporation, and micro-injection (see, generally, 
Sambrook et al, 1982 and 1989). 

Once expressed, the antibodies, individual mutated immunoglobulin chains,
mutated antibody fragments, and other immunoglobulin polypeptides of the invention can
be purified according to standard procedures of the art, including ammonium sulfate
precipitation, fraction column chromatography, gel electrophoresis and the like (see,
generally, Scopes, 1982). Once purified, partially or to homogeneity as desired, the
polypeptides may then be used therapeutically or in developing and performing assay
procedures, immunofluorescent stainings, and the like (see, generally, Lefkovits and
Pernis, 1979 and 1981; Lefkovits, 1997).

The antibodies generated by the method of the present invention can be used for
diagnosis and therapy. By way of illustration and not limitation, they can be used to treat
cancer, autoimmune diseases, or viral infections. For treatment of cancer, the antibodies
will typically bind to an antigen expressed preferentially on cancer cells, such as erbB-2,
CEA, CD33, and many other antigens and binding members well known to those skilled
in the art.


Two-Hybrid Based Screening Assays

Shuffling can also be used to recombinatorially diversify a pool of selected library
members obtained by screening a two-hybrid screening system to identify library
members which bind a predetermined polypeptide sequence. The selected library
members are pooled and shuffled by in vitro and/or in vivo recombination. The shuffled
pool can then be screened in a yeast two-hybrid system to select library members which
bind said predetermined polypeptide sequence (e.g., an SH2 domain) or which bind an
alternate predetermined polypeptide sequence (e.g., an SH2 domain from another protein
species).

An approach to identifying polypeptide sequences which bind to a predetermined
polypeptide sequence has been to use a so-called "two-hybrid" system wherein the
predetermined polypeptide sequence is present in a fusion protein (Chien et al., 1991).
This approach identifies protein-protein interactions in vivo through reconstitution of a
transcriptional activator (Fields and Song, 1989), the yeast Gal4 transcription protein.
Typically, the method is based on the properties of the yeast Gal4 protein, which consists
of separable domains responsible for DNA-binding and transcriptional activation.
Polynucleotides encoding two hybrid proteins, one consisting of the yeast Gal4
DNA-binding domain fused to a polypeptide sequence of a known protein and the other
consisting of the Gal4 activation domain fused to a polypeptide sequence of a second
protein, are constructed and introduced into a yeast host cell. Intermolecular binding
between the two fusion proteins reconstitutes the Gal4 DNA-binding domain with the
Gal4 activation domain, which leads to the transcriptional activation of a reporter gene
(e.g., lacZ, HIS3) which is operably linked to a Gal4 binding site. Typically, the
two-hybrid method is used to identify novel polypeptide sequences which interact with a
known protein (Silver and Hunt, 1993; Durfee et al., 1993; Yang et al., 1992; Luban et
al., 1993; Hardy et al., 1992; Bartel et al., 1993; and Vojtek et al., 1993). However,
variations of the two-hybrid method have been used to identify mutations of a known
protein that affect its binding to a second known protein (Li and Fields, 1993; Lalo et al.,
1993; Jackson et al., 1993; and Madura et al., 1993). Two-hybrid systems have also been
used to identify interacting structural domains of two known proteins (Bardwell et al.,
1993; Chakrabarty et al., 1992; Staudinger et al., 1993; and Milne and Weaver, 1993)
or domains responsible for oligomerization of a single protein (Iwabuchi et al., 1993;
Bogerd et al., 1993). Variations of two-hybrid systems have been used to study the in
vivo activity of a proteolytic enzyme (Dasmahapatra et al., 1992). Alternatively, an E.
coli/BCCP interactive screening system (Germino et al., 1993; Guarente, 1993) can be
used to identify interacting protein sequences (i.e., protein sequences which
heterodimerize or form higher order heteromultimers). Sequences selected by a
two-hybrid system can be pooled and shuffled and introduced into a two-hybrid system for
one or more subsequent rounds of screening to identify polypeptide sequences which
bind to the hybrid containing the predetermined binding sequence. The sequences thus
identified can be compared to identify consensus sequence(s) and consensus sequence
kernels.

In general, standard techniques of recombinant DNA technology are described
in various publications (e.g., Sambrook et al., 1989; Ausubel et al., 1987; and Berger and
Kimmel, 1987), each of which is incorporated herein in its entirety by reference.
Polynucleotide modifying enzymes were used according to the manufacturer's
recommendations. Oligonucleotides were synthesized on an Applied Biosystems Inc.
Model 394 DNA synthesizer using ABI chemicals. If desired, PCR amplimers for
amplifying a predetermined DNA sequence may be selected at the discretion of the
practitioner.

One microgram samples of template DNA are obtained and treated with U.V. light
to cause the formation of dimers, including TT dimers, particularly pyrimidine dimers. U.V.
exposure is limited so that only a few photoproducts are generated per gene on the
template DNA sample. Multiple samples are treated with U.V. light for varying periods
of time to obtain template DNA samples with varying numbers of dimers from U.V.
exposure.

A random priming kit which utilizes a non-proofreading polymerase (for example,
Prime-It II Random Primer Labeling kit by Stratagene Cloning Systems) is utilized to
generate different size polynucleotides by priming at random sites on templates which are
prepared by U.V. light (as described above) and extending along the templates. The
priming protocols such as described in the Prime-It II Random Primer Labeling kit may
be utilized to extend the primers. The dimers formed by U.V. exposure serve as a
roadblock for the extension by the non-proofreading polymerase. Thus, a pool of random
size polynucleotides is present after extension with the random primers is finished.

The present invention is further directed to a method for generating a selected
mutant polynucleotide sequence (or a population of selected polynucleotide sequences)
typically in the form of amplified and/or cloned polynucleotides, whereby the selected
polynucleotide sequence(s) possess at least one desired phenotypic characteristic (e.g.,
encodes a polypeptide, promotes transcription of linked polynucleotides, binds a protein,
and the like) which can be selected for. One method for identifying hybrid polypeptides
that possess a desired structure or functional property, such as binding to a predetermined
biological macromolecule (e.g., a receptor), involves the screening of a large library of
polypeptides for individual library members which possess the desired structure or
functional property conferred by the amino acid sequence of the polypeptide.


In one embodiment, the present invention provides a method for generating
libraries of displayed polypeptides or displayed antibodies suitable for affinity interaction
screening or phenotypic screening. The method comprises (1) obtaining a first plurality
of selected library members comprising a displayed polypeptide or displayed antibody
and an associated polynucleotide encoding said displayed polypeptide or displayed
antibody, and obtaining said associated polynucleotides or copies thereof wherein said
associated polynucleotides comprise a region of substantially identical sequences,
optionally introducing mutations into said polynucleotides or copies, (2) pooling the
polynucleotides or copies, (3) producing smaller or shorter polynucleotides by
interrupting a random or particularized priming and synthesis process or an amplification
process, and (4) performing amplification, preferably PCR amplification, and optionally
mutagenesis to homologously recombine the newly synthesized polynucleotides.



It is a particularly preferred object of the invention to provide a process for
producing hybrid polynucleotides which express a useful hybrid polypeptide by a series
of steps comprising:

(a) producing polynucleotides by interrupting a polynucleotide amplification
or synthesis process with a means for blocking or interrupting the amplification or
synthesis process and thus providing a plurality of smaller or shorter polynucleotides due
to the replication of the polynucleotide being in various stages of completion;

(b) adding to the resultant population of single- or double-stranded
polynucleotides one or more single- or double-stranded oligonucleotides, wherein said
added oligonucleotides comprise an area of identity in an area of heterology to one or
more of the single- or double-stranded polynucleotides of the population;

(c) denaturing the resulting single- or double-stranded oligonucleotides to
produce a mixture of single-stranded polynucleotides, optionally separating the shorter or
smaller polynucleotides into pools of polynucleotides having various lengths and further
optionally subjecting said polynucleotides to a PCR procedure to amplify one or more
oligonucleotides comprised by at least one of said polynucleotide pools;

(d) incubating a plurality of said polynucleotides or at least one pool of said
polynucleotides with a polymerase under conditions which result in annealing of said
single-stranded polynucleotides at regions of identity between the single-stranded
polynucleotides and thus forming of a mutagenized double-stranded polynucleotide
chain;

(e) optionally repeating steps (c) and (d);

(f) expressing at least one hybrid polypeptide from said polynucleotide chain,
or chains; and

(g) screening said at least one hybrid polypeptide for a useful activity.

In a preferred aspect of the invention, the means for blocking or interrupting the
amplification or synthesis process is by utilization of UV light, DNA adducts, or DNA
binding proteins.

In one embodiment of the invention, the DNA adducts, or polynucleotides
comprising the DNA adducts, are removed from the polynucleotides or polynucleotide
pool, such as by a process including heating the solution comprising the DNA fragments
prior to further processing.

Having thus disclosed exemplary embodiments of the present invention, it should
be noted by those skilled in the art that the disclosures are exemplary only and that
various other alternatives, adaptations and modifications may be made within the scope
of the present invention. Accordingly, the present invention is not limited to the specific
embodiments as illustrated herein.

Without further elaboration, it is believed that one skilled in the art can, using the
preceding description, utilize the present invention to its fullest extent. The following
examples are to be considered illustrative and thus are not limiting of the remainder of
the disclosure in any way whatsoever.



Example 1

Generation of Random Size Polynucleotides Using U.V. Induced Photoproducts

One microgram samples of template DNA are obtained and treated with U.V. light
to cause the formation of dimers, including TT dimers, particularly pyrimidine dimers. U.V.
exposure is limited so that only a few photoproducts are generated per gene on the
template DNA sample. Multiple samples are treated with U.V. light for varying periods
of time to obtain template DNA samples with varying numbers of dimers from U.V.
exposure.

A random priming kit which utilizes a non-proofreading polymerase (for
example, Prime-It II Random Primer Labeling kit by Stratagene Cloning Systems) is
utilized to generate different size polynucleotides by priming at random sites on
templates which are prepared by U.V. light (as described above) and extending along the
templates. The priming protocols such as described in the Prime-It II Random Primer
Labeling kit may be utilized to extend the primers. The dimers formed by U.V. exposure
serve as a roadblock for the extension by the non-proofreading polymerase. Thus, a pool
of random size polynucleotides is present after extension with the random primers is
finished.
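
The way dimers act as roadblocks can be illustrated with a toy simulation. This is only a sketch and not part of the described protocol: the template length, the number of dimers, the number of priming events, and the function name `simulate_fragments` are all hypothetical, and the model assumes that each primer extends in one direction until it meets the first dimer downstream or the end of the template.

```python
import random

def simulate_fragments(template_len, n_dimers, n_primers, seed=0):
    """Toy model: primers anneal at random sites and extend until the next
    U.V.-induced dimer (a roadblock) or the end of the template."""
    rng = random.Random(seed)
    # Hypothetical dimer positions, sorted along the template.
    dimers = sorted(rng.sample(range(1, template_len), n_dimers))
    lengths = []
    for _ in range(n_primers):
        start = rng.randrange(template_len)
        # First roadblock strictly downstream of the priming site, if any.
        stop = next((d for d in dimers if d > start), template_len)
        lengths.append(stop - start)
    return dimers, lengths

dimers, lengths = simulate_fragments(template_len=1000, n_dimers=4, n_primers=200)
print("dimer positions:", dimers)
print("fragment length range:", min(lengths), "-", max(lengths))
```

In this idealized picture, raising `n_dimers` (a longer U.V. exposure) tends to shorten the extension products, which is the reason samples are irradiated for varying periods.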
Example 2

Isolation of Random Size Polynucleotides

Polynucleotides of interest which are generated according to Example 1 are gel
isolated on a 1.5% agarose gel. Polynucleotides in the 100-300 bp range are cut out of
the gel and 3 volumes of 6 M NaI are added to the gel slice. The mixture is incubated at
50 °C for 10 minutes and 10 µl of glass milk (Bio 101) is added. The mixture is spun for
1 minute and the supernatant is decanted. The pellet is washed with 500 µl of Column
Wash (Column Wash is 50% ethanol, 10 mM Tris-HCl pH 7.5, 100 mM NaCl and 2.5 mM
EDTA) and spun for 1 minute, after which the supernatant is decanted. The washing,
spinning and decanting steps are then repeated. The glass milk pellet is resuspended in
20 µl of H2O and spun for 1 minute. DNA remains in the aqueous phase.
Example 3

Shuffling of Isolated Random Size 100-300 bp Polynucleotides

The 100-300 bp polynucleotides obtained in Example 2 are recombined in an
annealing mixture (0.2 mM each dNTP, 2.2 mM MgCl2, 50 mM KCl, 10 mM Tris-HCl
pH 8.8, 0.1% Triton X-100, 0.3 µl Taq DNA polymerase, 50 µl total volume) without
adding primers. A Robocycler by Stratagene was used for the annealing step with the
following program: 95 °C for 30 seconds, 25-50 cycles of [95 °C for 30 seconds, 50-60
°C (preferably 58 °C) for 30 seconds, and 72 °C for 30 seconds] and 5 minutes at 72 °C.
Thus, the 100-300 bp polynucleotides combine to yield double-stranded polynucleotides
having a longer sequence. After separating out the reassembled double-stranded
polynucleotides and denaturing them to form single-stranded polynucleotides, the cycling
is optionally again repeated with some samples utilizing the single strands as template
and primer DNA and other samples utilizing random primers in addition to the single
strands.
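
The idea that overlapping fragments can prime one another and grow into longer products without added primers can be sketched in a few lines. The code below is a deliberately simplified illustration, not the procedure of this example: it replaces annealing and polymerase extension with an exact string-overlap merge, uses a single random parent sequence, and all names and parameters (`merge_by_overlap`, `reassemble`, a 15-base minimum overlap) are assumptions made for the illustration. It does not model recombination between different parental sequences.

```python
import random

def merge_by_overlap(a, b, min_overlap):
    """Extend a with b when the 3' end of a matches the 5' end of b over at
    least min_overlap bases; return None when no such overlap exists."""
    for k in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a.endswith(b[:k]):
            return a + b[k:]
    return None

def reassemble(fragments, min_overlap=15, cycles=25):
    """Grow one product by repeatedly joining fragments at overlapping ends."""
    product = max(fragments, key=len)
    for _ in range(cycles):
        extended = False
        for frag in fragments:
            # Try the fragment downstream of the product, then upstream of it.
            candidate = merge_by_overlap(product, frag, min_overlap)
            if candidate is not None and len(candidate) > len(product):
                product, extended = candidate, True
            candidate = merge_by_overlap(frag, product, min_overlap)
            if candidate is not None and len(candidate) > len(product):
                product, extended = candidate, True
        if not extended:
            break
    return product

rng = random.Random(1)
parent = "".join(rng.choice("ACGT") for _ in range(300))
# Hypothetical 100-nt fragments overlapping their neighbours by 50 nt.
fragments = [parent[i:i + 100] for i in range(0, 201, 50)]
rng.shuffle(fragments)
product = reassemble(fragments)
print(len(product), product == parent)
```

Each pass of the outer loop plays the role of one denature, anneal, and extend cycle: products only lengthen where two molecules share a region of identity.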
Example 4

Screening of Polypeptides from Shuffled Polynucleotides

The polynucleotides of Example 3 are separated and polypeptides are expressed
therefrom. The original template DNA is utilized as a comparative control by obtaining
comparative polypeptides therefrom. The polypeptides obtained from the shuffled
polynucleotides of Example 3 are screened for the activity of the polypeptides obtained
from the original template and compared with the activity levels of the control. The
shuffled polynucleotides coding for interesting polypeptides discovered during screening
are compared further for secondary desirable traits. Some shuffled polynucleotides
corresponding to less interesting screened polypeptides are subjected to reshuffling.
Example 5

Directed Evolution of an Enzyme by Saturation Mutagenesis

Site-Saturation Mutagenesis: To accomplish site-saturation mutagenesis, every residue
(316) of a dehalogenase enzyme was converted into all 20 amino acids by site directed
mutagenesis using 32-fold degenerate oligonucleotide primers, as follows:

1. A culture of the dehalogenase expression construct was grown and a preparation of
the plasmid was made.

2. Primers were made to randomize each codon; they have the common structure
X20NN(G/T)X20.

3. A reaction mix of 25 µl was prepared containing ~50 ng of plasmid template, 125 ng
of each primer, 1X native Pfu buffer, 200 µM each dNTP and 2.5 U native Pfu DNA
polymerase.

4. The reaction was cycled in a Robo96 Gradient Cycler as follows:
Initial denaturation at 95°C for 1 min
20 cycles of 95°C for 45 sec, 53°C for 1 min and 72°C for 11 min
Final elongation step of 72°C for 10 min

5. The reaction mix was digested with 10 U of DpnI at 37°C for 1 hour to digest the
methylated template DNA.

6. Two µl of the reaction mix were used to transform 50 µl of XL1-Blue MRF' cells and
the entire transformation mix was plated on a large LB-Amp-Met plate yielding 200-
1000 colonies.

7. Individual colonies were toothpicked into the wells of 96-well microtiter plates
containing LB-Amp-IPTG and grown overnight.

8. The clones on these plates were assayed the following day.
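
The "32-fold degenerate" figure for an NN(G/T) codon can be checked by enumeration. The following sketch assumes the standard genetic code (written out in the conventional TCAG order) and reads N as any of the four bases; the variable names are illustrative only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# NN(G/T): any base at positions 1 and 2, G or T at position 3.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
encoded = {CODON_TABLE[c] for c in nnk_codons}
amino_acids = sorted(encoded - {"*"})
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]

print(len(nnk_codons), "codons;", len(amino_acids), "amino acids; stops:", stop_codons)
```

Restricting the third position to G or T keeps every amino acid reachable while excluding two of the three stop codons, which is why a single degenerate primer pair per position suffices.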



Screening: Approximately 200 clones of mutants for each position were grown in liquid
media (384-well microtiter plates) and screened as follows:

1. Overnight cultures in 384-well plates were centrifuged and the media removed.
To each well was added 0.06 mL 1 mM Tris/SO₄²⁻ pH 7.8.

2. Two assay plates consisting of 0.02 mL cell suspension were made from each parent
growth plate.

3. One assay plate was placed at room temperature and the other at elevated
temperature (initial screen used 55°C) for a period of time (initially 30 minutes).

4. After the prescribed time 0.08 mL room temperature substrate (TCP saturated 1
mM Tris/SO₄²⁻ pH 7.8 with 1.5 mM NaN3 and 0.1 mM bromothymol blue) was
added to each well.

5. Measurements at 620 nm were taken at various time points to generate a progress
curve for each well.

6. Data were analyzed and the kinetics of the cells heated to those not heated were
compared. Each plate contained 1-2 columns (24 wells) of unmutated 20F12
controls.

7. Wells that appeared to have improved stability were re-grown and tested under the
same conditions.

Following this procedure nine single site mutations appeared to confer increased
thermal stability on the enzyme. Sequence analysis was performed to determine the
exact amino acid changes at each position that were specifically responsible for the
improvement. In sum, the improvement was conferred at 7 sites by one amino acid
change alone, at an eighth site by each of two amino acid changes, and at a ninth site by
each of three amino acid changes. Several mutants were then made each having a
plurality of these nine beneficial site mutations in combination; of these, two mutants
proved superior to all the other mutants, including those with single point mutations.
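
The choice of roughly 200 clones per position can be put in context with a simple sampling estimate. The sketch below is an idealization and not part of the reported work: it assumes that each of the 32 NN(G/T) codons is equally likely in every clone (real primer pools and transformations are biased), and the function names are hypothetical.

```python
from math import comb

def p_codon_missing(n_codons, n_clones):
    """Probability that one particular codon never appears among n_clones,
    assuming every codon is equally likely in each clone."""
    return ((n_codons - 1) / n_codons) ** n_clones

def p_all_codons_seen(n_codons, n_clones):
    """Probability that every codon appears at least once (inclusion-exclusion)."""
    return sum((-1) ** k * comb(n_codons, k) * ((n_codons - k) / n_codons) ** n_clones
               for k in range(n_codons + 1))

print(f"P(a given codon is missed) = {p_codon_missing(32, 200):.4f}")
print(f"P(all 32 codons sampled)   = {p_all_codons_seen(32, 200):.3f}")
```

Under these assumptions, screening about six times as many clones as there are codons leaves only a small chance that any particular substitution at a position goes untested.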
Example 6

Direct expression cloning using end-selection

An esterase gene was amplified using 5' phosphorylated primers in a standard
PCR reaction (10 ng template; PCR conditions: 3' 94 °C; [1' 94 °C; 1' 50 °C; 1'30" 68 °C] x
30; 10' 68 °C).

Forward Primer = 9511TopF
(CTAGAAGGGAGGAGAATTACATGAAGCGGCTTTTAGCCC)
Reverse Primer = 9511TopR (AGCTAAGGGTCAAGGCCGCACCCGAGG)

The resulting PCR product (ca. 1000 bp) was gel purified and quantified.

A vector for expression cloning, pASK3 (Institut fuer Bioanalytik, Goettingen,
Germany), was cut with Xba I and Bgl II and dephosphorylated with CIP.

0.5 pmoles Vaccinia Topoisomerase I (Invitrogen, Carlsbad, CA) was added to 60
ng (ca. 0.1 pmole) purified PCR product for 5' at 37 °C in buffer NEB I (New England
Biolabs, Beverly, MA) in 5 µl total volume.
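
The stated equivalence of 60 ng and roughly 0.1 pmole for a product of about 1000 bp can be checked with a short calculation. The sketch assumes the commonly used average mass of 660 g/mol per base pair of double-stranded DNA, which is not given in the text; the function name is illustrative.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair; a commonly used average, assumed here

def dsdna_pmol(mass_ng, length_bp, bp_mass=AVG_BP_MASS):
    """Convert a mass of double-stranded DNA (ng) to picomoles of molecules."""
    return mass_ng * 1000.0 / (length_bp * bp_mass)

pcr_product_pmol = dsdna_pmol(60, 1000)
enzyme_to_dna = 0.5 / pcr_product_pmol
print(f"{pcr_product_pmol:.3f} pmol product, topoisomerase:DNA ~ {enzyme_to_dna:.1f}:1")
```

On this estimate the topoisomerase is present in several-fold molar excess over the PCR product, and therefore in excess over its two ends as well.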

The topogated PCR product was cloned into the vector pASK3 (5 µl, ca. 200 ng in NEB
I) for 5' at room temperature.

This mixture was dialyzed against H2O for 30'.

2 µl were used for electroporation of DH10B cells (Gibco BRL, Gaithersburg, MD).

Efficiency: Based on the actual clone numbers this method can produce 2x10^
clones per µg vector. All tested recombinants showed esterase activity after induction
with anhydrotetracycline.
Dehalogenase Thermal Stability

This invention provides that a desirable property to be generated by directed
evolution is exemplified in a limiting fashion by an improved residual activity (e.g. an
enzymatic activity, an immunoreactivity, an antibiotic activity, etc.) of a molecule upon
subjection to an altered environment, including what may be considered a harsh environment,
for a specified time. Such a harsh environment may comprise any combination of the
following (iteratively or not, and in any order or permutation): an elevated temperature
(including a temperature that may cause denaturation of a working enzyme), a decreased
temperature, an elevated salinity, a decreased salinity, an elevated pH, a decreased pH, an
elevated pressure, a decreased pressure, and a change in exposure to a radiation source
(including UV radiation, visible light, as well as the entire electromagnetic spectrum).

The following example shows an application of directed evolution to evolve the
ability of an enzyme to regain and/or retain activity upon exposure to an elevated
temperature.

Every residue (316) of a dehalogenase enzyme was converted into all 20 amino acids by
site directed mutagenesis using 32-fold degenerate oligonucleotide primers. These
mutations were introduced into the already rate-improved variant Dhla 20F12.
Approximately 200 clones of each position were grown in liquid media (384-well
microtiter plates) to be screened. The screening procedure was as follows:

1. Overnight cultures in 384-well plates were centrifuged and the media removed.
To each well was added 0.06 mL 1 mM Tris/SO₄²⁻ pH 7.8.

2. The robot made 2 assay plates from each parent growth plate consisting of 0.02
mL cell suspension.

3. One assay plate was placed at room temperature and the other at elevated
temperature (initial screen used 55°C) for a period of time (initially 30 minutes).

4. After the prescribed time 0.08 mL room temperature substrate (TCP saturated 1
mM Tris/SO₄²⁻ pH 7.8 with 1.5 mM NaN3 and 0.1 mM bromothymol blue) was
added to each well. TCP = trichloropropane.

5. Measurements at 620 nm were taken at various time points to generate a progress
curve for each well.

6. Data were analyzed and the kinetics of the cells heated to those not heated were
compared. Each plate contained 1-2 columns (24 wells) of un-mutated 20F12
controls.

7. Wells that appeared to have improved stability were regrown and tested under the
same conditions.
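
The comparison in step 6 can be sketched as a ratio of progress-curve slopes for the heated and unheated copies of each well. This is a minimal illustration under stated assumptions, not the analysis actually used: the absorbance readings are invented, the rate is taken as a least-squares slope over all time points, and the absorbance at 620 nm is assumed to fall as the reaction acidifies the weakly buffered indicator solution. The function names are hypothetical.

```python
def initial_rate(times, absorbances):
    """Least-squares slope of absorbance versus time (A620 per time unit)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_a = sum(absorbances) / n
    num = sum((t - mean_t) * (a - mean_a) for t, a in zip(times, absorbances))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den

def residual_activity(times, heated, unheated):
    """Ratio of heated to unheated rates for one clone (1.0 = no loss on heating)."""
    return initial_rate(times, heated) / initial_rate(times, unheated)

# Invented readings (minutes, A620) for one hypothetical mutant and one control well.
times = [0, 5, 10, 15]
unheated = [1.00, 0.90, 0.80, 0.70]
mutant_heated = [1.00, 0.92, 0.84, 0.76]
control_heated = [1.00, 0.98, 0.96, 0.94]

print(f"mutant:  {residual_activity(times, mutant_heated, unheated):.2f}")
print(f"control: {residual_activity(times, control_heated, unheated):.2f}")
```

A well whose ratio clearly exceeds that of the un-mutated 20F12 controls on the same plate would be a candidate for regrowth and retesting.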

Following this procedure nine single site mutations appeared to confer increased thermal
stability on Dhla-20F12. Sequence analysis showed that the following changes were
beneficial:

D89G
F91S
T159L
G189Q, G189V
I220L
N238T
W251Y
P302A, P302L, P302S, P302K
P302R/S306R

Only two sites (189 and 302) had more than one substitution. The first 5 on the list were
combined (using G189Q) into a single gene (this mutant is referred to as "Dhla5"). All
changes but S306R were incorporated into another variant referred to as Dhla8.
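
The substitution notation used above (wild-type residue, position, replacement residue) is easy to handle programmatically. The sketch below parses the listed changes and groups them by position; treating the double mutant P302R/S306R as two separate substitutions is an assumption made for the tally, and the helper name `parse` is illustrative.

```python
import re
from collections import defaultdict

# Substitutions as listed above; the double mutant is split at the slash.
LISTED = ["D89G", "F91S", "T159L", "G189Q", "G189V", "I220L", "N238T",
          "W251Y", "P302A", "P302L", "P302S", "P302K", "P302R/S306R"]

PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def parse(mutation):
    """Split 'D89G' into (wild-type residue, position, replacement residue)."""
    m = PATTERN.match(mutation)
    if m is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    return m.group(1), int(m.group(2)), m.group(3)

by_site = defaultdict(set)
for entry in LISTED:
    for part in entry.split("/"):
        wild_type, position, replacement = parse(part)
        by_site[position].add(replacement)

multi = sorted(pos for pos, subs in by_site.items() if len(subs) > 1)
print("sites with more than one substitution:", multi)
```

Grouping by position in this way reproduces the observation that only positions 189 and 302 tolerate more than one beneficial replacement.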
Thermal stability was assessed by incubating the enzyme at the elevated temperature
(55°C and 80°C) for some period of time and assaying activity at 30°C. Initial rates were
plotted vs. time at the higher temperature. The enzyme was in 50 mM Tris/SO4 pH 7.8
for both the incubation and the assay. Product (Cl⁻) was detected by a standard method
using Fe(NO3)3 and HgSCN. Dhla 20F12 was used as the de facto wild type. The
apparent half-life (T1/2) was calculated by fitting the data to an exponential decay
function.
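
A half-life calculation of this kind can be sketched as follows. The fitting method is not specified in the text, so this example makes an assumption: it fits rate = A0 · exp(-k·t) by linear regression on the logarithm of the initial rates and reports T1/2 = ln 2 / k. The data points are synthetic, generated from a known half-life purely to show that the fit recovers it.

```python
import math

def fit_half_life(times, rates):
    """Fit rate = A0 * exp(-k * t) by linear regression on ln(rate);
    return (A0, k, half_life)."""
    logs = [math.log(r) for r in rates]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
             / sum((t - mean_t) ** 2 for t in times))
    k = -slope
    a0 = math.exp(mean_y - slope * mean_t)
    return a0, k, math.log(2) / k

# Synthetic initial rates (arbitrary units) after 0-60 min at the elevated
# temperature, generated with a half-life of 20 min.
times = [0, 10, 20, 40, 60]
rates = [100.0 * math.exp(-math.log(2) * t / 20.0) for t in times]

a0, k, t_half = fit_half_life(times, rates)
print(f"A0 = {a0:.1f}, k = {k:.4f} per min, T1/2 = {t_half:.1f} min")
```

With real, noisy data a nonlinear least-squares fit on the untransformed rates may be preferable, since the logarithm gives extra weight to the small rates measured at long incubation times.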
3. CLAIMS

1. A method for obtaining an immunomodulatory polynucleotide that has an
optimized modulatory effect on an immune response, or encodes a polypeptide that has
an optimized modulatory effect on an immune response, the method comprising:

creating a library of non-stochastically generated progeny polynucleotides from a 
parental polynucleotide set; 

wherein optimization can thus be achieved using one or more of the directed
evolution methods as described herein in any combination, permutation and iterative
manner;

whereby these directed evolution methods include the introduction of mutations 
by non-stochastic methods, including by "gene site saturation mutagenesis" as described 
herein; 

and whereby these directed evolution methods also include the introduction of
mutations by non-stochastic polynucleotide reassembly methods as described herein;
including by synthetic ligation polynucleotide reassembly as described herein.

2. The method of claim 1, wherein said optimized modulatory effect on an
immune response is induced by a genetic vaccine vector.

3. A method for obtaining an immunomodulatory polynucleotide that has an
optimized modulatory effect on an immune response, or encodes a polypeptide that has
an optimized modulatory effect on an immune response, the method comprising:

screening a library of non-stochastically generated progeny polynucleotides to 
identify an optimized non-stochastically generated progeny polynucleotide that has, or 
encodes a polypeptide that has, a modulatory effect on an immune response; wherein the
optimized non-stochastically generated polynucleotide or the polypeptide encoded by the 
non-stochastically generated polynucleotide exhibits an enhanced ability to modulate an 
immune response compared to a parental polynucleotide from which the library was 
created. 

4. The method of claim 3, wherein said optimized modulatory effect on an
immune response is induced by a genetic vaccine vector.

5. A method for obtaining an immunomodulatory polynucleotide that has an
optimized modulatory effect on an immune response, or encodes a polypeptide that has
an optimized modulatory effect on an immune response, the method comprising:

a) creating a library of non-stochastically generated progeny polynucleotides 
from a parental polynucleotide set; and 

b) screening the library to identify an optimized non-stochastically generated
progeny polynucleotide that has, or encodes a polypeptide that has, a modulatory effect
on an immune response induced by a genetic vaccine vector, wherein the optimized
non-stochastically generated polynucleotide or the polypeptide encoded by the
non-stochastically generated polynucleotide exhibits an enhanced ability to modulate an
immune response compared to a parental polynucleotide from which the library was
created;

whereby optimization can thus be achieved using one or more of the directed 
evolution methods as described herein in any combination, permutation, and iterative 
manner; 
whereby these directed evolution methods include the introduction of point
mutations by non-stochastic methods, including by "gene site saturation mutagenesis" as 
described herein; 

and whereby these directed evolution methods also include the introduction of
mutations by non-stochastic polynucleotide reassembly methods as described herein;
including by synthetic ligation polynucleotide reassembly as described herein.

6. The method of claim 5, wherein said optimized modulatory effect on an
immune response is induced by a genetic vaccine vector.

7. The method of any of claims 1-6, wherein the optimized non- 
stochastically generated polynucleotide is incorporated into a genetic vaccine vector. 

8. The method of any of claims 1-6, wherein the optimized
non-stochastically generated polynucleotide, or a polypeptide encoded by the optimized
non-stochastically generated polynucleotide, is administered in conjunction with a genetic
vaccine vector.

9. The method of any of claims 1-6, wherein the library of non-stochastically
generated progeny polynucleotides is created by a process selected from the group
consisting of gene reassembly, oligonucleotide-directed saturation mutagenesis, and any
combination, permutation and iterative manner.

10. The method of any of claims 1-6, wherein the optimized
non-stochastically generated polynucleotide that has a modulatory effect on an immune
response is obtained by:

a) non-stochastically reassembling at least two parental template
polynucleotides, each of which is, or encodes a molecule that is, involved in modulating
an immune response;

wherein the first and second parental templates differ from each other in two or 
more nucleotides, to produce a library of non-stochastically generated polynucleotides; 
and 

b) screening the library to identify at least one optimized non-stochastically
generated polynucleotide that exhibits, either by itself or through the encoded molecule,
an enhanced ability to modulate an immune response in comparison to a parental
polynucleotide from which the library was created.

11. The method of claim 10, wherein the method further comprises the steps
of:

c) subjecting a working optimized non-stochastically generated
polynucleotide to a further round of non-stochastic reassembly with at least one
additional polynucleotide, which is the same or different from the first and second
polynucleotides, to produce a further working library of recombinant polynucleotides;

d) screening the further working library to identify at least one further
optimized non-stochastically generated polynucleotide that exhibits an enhanced ability
to modulate an immune response in comparison to a parental polynucleotide from which
the library was created; and

e) optionally repeating c) and d) as necessary, until a desirable further
optimized non-stochastically generated polynucleotide that exhibits an enhanced ability
to modulate an immune response than a form of the nucleic acid from which the library
was created.
12. The method of any of claims 1-6, wherein the optimized
non-stochastically generated polynucleotide encodes a polypeptide that can interact with a
cellular receptor involved in mediating an immune response; wherein the polypeptide
acts as an agonist or antagonist of the receptor.

13. The method of claim 12, wherein the cellular receptor is a macrophage
scavenger receptor.

14. The method of claim 12, wherein the cellular receptor is selected from the 
group consisting of a cytokine receptor and a chemokine receptor. 

15. The method of claim 14, wherein the chemokine receptor is CCR6. 

16. The method of claim 12, wherein the polypeptide mimics the activity of a
natural ligand for the receptor but does not induce immune reactivity to said natural
ligand.

17. The method of claim 12, wherein the library is screened by: 

i) expressing the non-stochastically generated progeny polynucleotides so
that the encoded polypeptides are produced as fusions with a protein displayed on the
surface of a replicable genetic package;



WO 00/46344    PCT/US00/03086



ii) contacting the replicable genetic packages with a plurality of cells that 
display the receptor, and 

iii) identifying cells that exhibit a modulation of an immune response 
mediated by the receptor. 

1 8 . The method of claim 1 7, wherein the replicable genetic package is 
selected from the group consisting of a bacteriophage, a cell, a spore, and a virus. 

19. The method of claim 18, wherein the replicable genetic package is an M13 
bacteriophage and the protein is encoded by gene III or gene VIII. 

20. The method of claim 12, which method further comprises introducing the 
optimized non-stochastically generated polynucleotide into a genetic vaccine vector and 
administering the vector to a mammal, wherein the peptide or polypeptide is expressed 
and acts as an agonist or antagonist of the receptor. 

21. The method of claim 12, which method further comprises producing the 
polypeptide encoded by the optimized non-stochastically generated polynucleotide and 
introducing the polypeptide into a mammal in conjunction with a genetic vaccine vector. 

22. The method of claim 12, wherein the optimized non-stochastically 
generated polynucleotide is inserted into an antigen-encoding nucleotide sequence of a 
genetic vaccine vector. 
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23. The method of claim 22, wherein the optimized non-stochastically 
generated polypeptide is introduced into a nucleotide sequence that encodes an M-loop 
of an HBsAg polypeptide. 

24. The method of any of claims 1-6, wherein the optimized 
non-stochastically generated polynucleotide comprises a nucleotide sequence rich in 
unmethylated CpG. 

25. The method of any of claims 1-6, wherein the optimized non- 
stochastically generated polynucleotide encodes a polypeptide that inhibits an allergic 
reaction. 

26. The method of claim 25, wherein the polypeptide is selected from the 
group consisting of interferon- , interferon- , IL-10, IL-12, an antagonist of IL-4, an 
antagonist of IL-5, and an antagonist of IL-13. 

27. The method of claim 1, wherein the optimized recombinant polynucleotide 
encodes an antagonist of IL-10. 

28. The method of claim 27, wherein the antagonist of IL-10 is soluble or 
defective IL-10 receptor or IL-20/MDA-7. 
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29. The method of any of claims 1-6, wherein the optimized non- 
stochastically generated polynucleotide encodes a co-stimulator. 

30. The method of claim 29, wherein the co-stimulator is B7-1 (CD80) or B7-2 
(CD86) and the screening step involves selecting variants with altered activity through 
CD28 or CTLA-4. 

31. The method of claim 29, wherein the co-stimulator is CD1, CD40, CD154 
(ligand for CD40) or CD150 (SLAM). 

32. The method of claim 29, wherein the co-stimulator is a cytokine. 

33. The method of claim 32, wherein the cytokine is selected from the group 
consisting of IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, 
IL-13, IL-14, IL-15, IL-16, IL-17, IL-18, GM-CSF, G-CSF, TNF., IFN- , IFN-, and 
IL-20 (MDA-7). 

34. The method of claim 33, wherein the library of non-stochastically generated 
polynucleotides is screened by testing the ability of cytokines encoded by the non- 
stochastically generated polynucleotides to activate cells which contain a receptor for the 
cytokine. 
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35. The method of claim 34, wherein the cells contain a heterologous nucleic 
acid that encodes the receptor for the cytokine. 

36. The method of claim 33, wherein the cytokine is interleukin-12 and the 
screening is performed by: growing mammalian cells which contain the genetic vaccine 
vector in a culture medium; and detecting whether T cell proliferation or T cell 
differentiation is induced by contact with the culture medium. 

37. The method of claim 33, wherein the cytokine is interferon- and the screening 
is performed by: 

i) expressing the non-stochastically generated polynucleotides so that the 
encoded polypeptides are produced as fusions with a protein displayed on the surface of a 
replicable genetic package; 

ii) contacting the replicable genetic packages with a plurality of B cells; and 

iii) identifying phage library members that are capable of inhibiting 
proliferation of the B cells. 

38. The method of claim 33, wherein the immune response of interest is 
differentiation of T cells to Th1 cells and the screening is performed by contacting a 
population of T cells with the cytokines encoded by the members of the library of 
recombinant polynucleotides and identifying library members that encode a cytokine that 
induces the T cells to produce IL-2 and interferon- . 

39. The method of claim 32, wherein the cytokine encoded by the optimized non- 
stochastically generated polynucleotide exhibits reduced immunogenicity compared to a 
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cytokine encoded by a non-optimized polynucleotide, and the reduced immunogenicity is 
detected by introducing a cytokine encoded by the non-stochastically generated 
polynucleotide into a mammal and determining whether an immune response is induced 
against the cytokine. 

40. The method of claim 29, wherein the co-stimulator is B7-1 (CD80) or B7-2 
(CD86) and the cell is tested for ability to costimulate an immune response. 

41. The method of any of claims 1-6, wherein the optimized recombinant 
polynucleotide encodes a cytokine antagonist. 

42. The method of claim 41, wherein the cytokine antagonist is selected from the 
group consisting of a soluble cytokine receptor and a transmembrane cytokine receptor 
having a defective signal sequence. 

43. The method of claim 41, wherein the cytokine antagonist is selected from 
the group consisting of IL-10R and IL-4R. 

44. The method of any of claims 1-6, wherein the optimized 
non-stochastically generated polynucleotide encodes a polypeptide capable of inducing a 
predominantly Th1 immune response. 
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45. The method of any of claims 1-6, wherein the optimized non- 
stochastically generated polynucleotide encodes a polypeptide capable of inducing a 
predominantly Th2 immune response. 

46. The method of any of claims 1-6, wherein said optimized modulatory 
effect on an immune response is a decrease in an unwanted modulatory effect on an 
immune response; 

whereby application of the method can be used to generate a molecule having a 
decreased ability to elicit an immune response from a host recipient of said molecule, 
where said recipient can be a human or an animal host; 

and whereby application of the method can thus be used to generate a molecule 
having decreased antigenicity with respect to at least one host recipient of said molecule. 



47. The method of any of claims 1-6, wherein said optimized modulatory 
effect on an immune response is an increase in a desirable modulatory effect on an 
immune response; 

whereby application of the method can be used to generate a molecule having an 
increased ability to elicit an immune response from a host recipient of said molecule, 
where said recipient can be a human or an animal host; 

and whereby application of the method can thus be used to generate a molecule 
having increased antigenicity with respect to at least one host recipient of said molecule. 
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48. The method of any of claims 1-6, wherein said optimized modulatory 
effect on an immune response is both a decrease in a first unwanted modulatory effect on 
an immune response as well as an increase in a second desirable modulatory effect on an 
immune response; 

whereby application of the method can be used to generate a molecule having 
both a decreased ability to elicit a first immune response from a first host recipient of said 
molecule as well as an increased ability to elicit a second immune response from a 
second host recipient of said molecule; 

whereby the first and the second recipient hosts can be the same or different; 

whereby each of the first and the second recipient hosts can be a human or an 
animal host; 

and whereby application of the method can thus be used to generate a molecule 
having both a first decreased antigenicity with respect to at least one host recipient of said 
molecule and a second decreased antigenicity with respect to at least one host recipient of 
said molecule. 

49. The method of claim 48, wherein said first and said second modulatory 
effect on an immune response are evolved for respectively a first and a second module on 
the same multimodule vaccine vector; 

whereby a module is exemplified by the following modules, as well as by a 
fragment derivative or analog thereof: an antigen coding sequence, a polyadenylation 
sequence, a sequence coding for a co-stimulatory molecule, a sequence coding for an 
inducible repressor or transactivator, a eukaryotic origin of replication, a prokaryotic 
origin of replication, a sequence coding for a prokaryotic marker, an enhancer, a 
promoter, an operator, and an intron. 
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50. The method of any of claims 1-6, wherein the optimized modulatory effect 
on an immune response is comprised of an increase in the stability of the 
immunomodulatory (IM) polynucleotide or polypeptide encoded thereby; 

whereby application of the method can be used to generate a molecule having an 
increased stability ex vivo, thus, for example, increasing shelf-life and/or ease of storage 
and/or length of time before expiration of activity upon storage; 

and whereby application of the method can also be used to generate a molecule 
having an increased stability in vivo upon administration to a host recipient, thus, for 
example, increasing resistance to digestive acids and/or increasing stability in the 
circulation and/or any other method of elimination or destruction by the host recipient. 



51. The method of any of claims 1-6, wherein the immunomodulatory (IM) 
polynucleotide or polypeptide encoded thereby has an optimized modulatory effect on an 
immune response in a human host recipient; 

whereby application of the method can thus be used to generate an optimized 
genetic vaccine for human recipients. 



52. The method of any of claims 1-6, wherein the immunomodulatory (IM) 
polynucleotide or polypeptide encoded thereby has an optimized modulatory effect on an 
immune response in an animal host recipient; 
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whereby application of the method can thus be used to generate an optimized 
genetic vaccine for animal recipients, including animals that are farmed or raised by man, 
animals that are not farmed or raised by man, domesticated animals, and non- 
domesticated animals. 



53. A method for obtaining an optimized polynucleotide that encodes an 
accessory molecule that improves the transport or presentation of antigens by a cell, the 
method comprising: 

a) creating a library of non-stochastically generated polynucleotides by 
subjecting to optimization by non-stochastic directed evolution a parental polynucleotide 
set in which is encoded all or part of the accessory molecule; and 

b) screening the library to identify an optimized non-stochastically generated 
progeny polynucleotide that encodes a recombinant molecule that confers upon a cell an 
increased or decreased ability to transport or present an antigen on a surface of the cell 
compared to an accessory molecule encoded by template polynucleotides not subjected to 
the non-stochastic reassembly; 

whereby application of the method can thus be used to generate an optimized 
molecule for human recipients and/or animal recipients, including animals that are farmed 
or raised by man, animals that are not farmed or raised by man, domesticated animals, 
and non-domesticated animals; 

whereby optimization can thus be achieved using one or more of the directed 
evolution methods as described herein in any combination, permutation, and iterative 
manner; 

whereby these directed evolution methods include the introduction of point 
mutations by non-stochastic methods, including by "gene site saturation mutagenesis" as 
described herein; 
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and whereby these directed evolution methods also include the introduction 
of mutations by non-stochastic polynucleotide reassembly methods as described herein; 
including by synthetic ligation polynucleotide reassembly as described herein. 

54. The method of claim 53, wherein the screening involves: 

i) introducing the library of non-stochastically generated polynucleotides 
into a genetic vaccine vector that encodes an antigen to form a library of vectors; 
introducing the library of vectors into mammalian cells; and 

ii) identifying mammalian cells that exhibit increased or decreased 
immunogenicity to the antigen. 

55. The method of claim 53, wherein the accessory molecule comprises a 
proteasome or a TAP polypeptide. 

56. The method of claim 53, wherein the accessory molecule comprises a 
cytotoxic T-cell inducing sequence. 

57. The method of claim 56, wherein the cytotoxic T-cell inducing sequence is 
obtained from a hepatitis B surface antigen. 

58. The method of claim 53, wherein the accessory molecule comprises an 
immunogenic agonist sequence. 

59. A method for obtaining an immunomodulatory polynucleotide that has an 
optimized expression in a recombinant expression host, the method comprising: 

creating a library of non-stochastically generated progeny polynucleotides from a 
parental polynucleotide set; 

whereby optimization can thus be achieved using one or more of the directed 
evolution methods as described herein in any combination, permutation and iterative 
manner; 

whereby these directed evolution methods include the introduction of mutations 
by non-stochastic methods, including by "gene site saturation mutagenesis" as described 
herein; 

and whereby these directed evolution methods also include the introduction 
of mutations by non-stochastic polynucleotide reassembly methods as described herein; 
including by synthetic ligation polynucleotide reassembly as described herein. 

60. A method for obtaining an immunomodulatory polynucleotide that has an 
optimized expression in a recombinant expression host, the method comprising: 

screening a library of non-stochastically generated progeny polynucleotides to 
identify an optimized non-stochastically generated progeny polynucleotide that has an 
optimized expression in a recombinant expression host when compared to the expression 
of a parental polynucleotide from which the library was created. 

61. A method for obtaining an immunomodulatory polynucleotide that has an 
optimized expression in a recombinant expression host, the method comprising: 

a) creating a library of non-stochastically generated progeny polynucleotides 
from a parental polynucleotide set; and 
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b) screening a library of non- stochastically generated progeny 
polynucleotides to identify an optimized non-stochastically generated progeny 
polynucleotide that has an optimized expression in a recombinant expression host when 
compared to the expression of a parental polynucleotide from which the library was 
created; 

whereby optimization can thus be achieved using one or more of the directed 
evolution methods as described herein in any combination, permutation, and iterative 
manner; 

whereby these directed evolution methods include the introduction of point 
mutations by non-stochastic methods, including by "gene site saturation mutagenesis" as 
described herein; 

and whereby these directed evolution methods also include the introduction 
of mutations by non-stochastic polynucleotide reassembly methods as described herein; 
including by synthetic ligation polynucleotide reassembly as described herein. 

62. The method of any of claims 59-61, wherein the recombinant expression 
host is a prokaryote. 

63. The method of any of claims 59-61, wherein the recombinant expression 
host is a eukaryote. 

64. The method of claim 63, wherein the recombinant expression host is a plant. 

65. The method of claim 64, wherein the recombinant expression host 
is a monocot. 

66. The method of claim 64, wherein the recombinant expression host 
is a dicot. 

67. The method of any of claims 1-6, 53, or 59-61, wherein creating a library 
of non-stochastically generated progeny polynucleotides from a parental polynucleotide 
set is comprised of subjecting the parental polynucleotide set to "gene site saturation 
mutagenesis" as described herein. 

68. The method of any of claims 1-6, 53, or 59-61, wherein creating a library 
of non-stochastically generated progeny polynucleotides from a parental polynucleotide 
set is comprised of subjecting the parental polynucleotide set to "synthetic ligation 
polynucleotide reassembly" as described herein. 

69. The method of any of claims 1-6, 53, or 59-61, wherein creating a library 
of non-stochastically generated progeny polynucleotides from a parental polynucleotide 
set is comprised of subjecting the parental polynucleotide set to both "gene site saturation 
mutagenesis" as described herein, and to "synthetic ligation polynucleotide reassembly" 
as described herein. 
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70. A method of producing a progeny polynucleotide set by subjecting a 
double-stranded circular parental polynucleotide molecule to mutagenesis, said method 
comprising the steps of: 

a) annealing a first primer and a second primer to said parental 
polynucleotide molecule; 

wherein said first primer is comprised of a first primer sequence that is 
complementary to a first annealment region of the parental polynucleotide molecule, 

wherein said second primer is comprised of a second primer sequence that is 
complementary to a second annealment region of the parental polynucleotide molecule, 

wherein said first annealment region and said second annealment region are 
non-overlapping and therefore staggered, 

and wherein at least one of said first and second primers contains a non-stochastic 
mutagenic cassette with respect to the parental polynucleotide molecule; and 

b) synthesizing by means of a polymerase-catalyzed amplification reaction a 
first progeny polynucleotide strand comprised of said first primer and a second progeny 
polynucleotide strand comprised of said second primer; 

wherein the first progeny polynucleotide strand and the second progeny 
polynucleotide strand may form a double-stranded mutagenized circular polynucleotide 
product. 

71. A method of producing a progeny polynucleotide set by subjecting a 
double-stranded circular parental polynucleotide molecule to mutagenesis, said method 
comprising the steps of: 

a) annealing a first primer and a second primer to said parental 
polynucleotide molecule; 
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wherein said first primer is comprised of a first primer sequence that is 
complementary to a first annealment region of the parental polynucleotide molecule, 

wherein said second primer is comprised of a second primer sequence that is 
complementary to a second annealment region of the parental polynucleotide molecule, 

wherein said first annealment region and said second annealment region are non- 
overlapping and therefore staggered, 

wherein at least one of said first and second primers contains a non-stochastic 
mutagenic cassette with respect to the parental polynucleotide molecule, and 

wherein said non-stochastic mutagenic cassette contained in said at least one 
primer is degenerate in nature; and 

b) synthesizing by means of a polymerase-catalyzed amplification reaction a 
first progeny polynucleotide strand comprised of said first primer and a second progeny 
polynucleotide strand comprised of said second primer; 

wherein the first progeny polynucleotide strand and the second progeny 
polynucleotide strand may form a double-stranded mutagenized circular polynucleotide 
product; 

whereby the generation of a degenerate progeny polynucleotide set may be 
achieved by applying said method. 

72. A method for producing from a template polypeptide a set of progeny 
polypeptides in which a non-stochastic range of single amino acid substitutions is 
represented at each amino acid position, comprising the steps of: 

a) subjecting a codon-containing template polynucleotide to 
polymerase-based amplification using a degenerate oligonucleotide for each codon to be 
mutagenized, wherein each of said degenerate oligonucleotides is comprised of a 
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first homologous sequence and a degenerate trinucleotide cassette, so as to 
generate a set of progeny polynucleotides; and 

b) subjecting said set of progeny polynucleotides to clonal amplification such 
that polypeptides encoded by the progeny polynucleotides are expressed; 

whereby, said method provides a means for generating a predetermined number of 
amino acids to be represented at each amino acid site along a parental polypeptide 
template, up to as many as all 20 amino acids at each of said amino acid sites. 



73. The method of claim 72, wherein said degenerate oligonucleotide is 
comprised of a first homologous sequence, a degenerate trinucleotide cassette, and a 
second homologous sequence. 
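
The oligonucleotide layout recited in claims 72 and 73 (a first homologous sequence, a degenerate trinucleotide cassette, and a second homologous sequence) can be illustrated with a minimal sketch. The function name `design_degenerate_oligos`, the `flank` length, and the use of the IUPAC code `NNK` to write an N,N,G/T cassette are assumptions made for illustration only; they are not taken from the specification.

```python
def design_degenerate_oligos(template, flank=15, cassette="NNK"):
    """Build one degenerate oligo per codon of `template` (hypothetical helper).

    Each oligo is: left homologous flank + degenerate cassette + right
    homologous flank. "NNK" is IUPAC shorthand for N,N,G/T.
    Codons that lack a full flank on either side are skipped here; in
    practice the flanks could extend into adjacent vector sequence.
    """
    if len(template) % 3 != 0:
        raise ValueError("template length must be a multiple of 3")
    if len(cassette) != 3:
        raise ValueError("cassette must be a trinucleotide")

    oligos = {}
    for start in range(0, len(template), 3):
        left = template[max(0, start - flank):start]
        right = template[start + 3:start + 3 + flank]
        if len(left) < flank or len(right) < flank:
            continue  # not enough homologous sequence on one side
        oligos[start // 3] = left + cassette + right
    return oligos


if __name__ == "__main__":
    # Arbitrary 8-codon example sequence, short flanks for readability.
    example = "ATGGCTAAAGGTCCATTCGAACTG"
    for codon_index, oligo in design_degenerate_oligos(example, flank=6).items():
        print(codon_index, oligo)
```

With a flank of 6 nucleotides, only the interior codons of the example receive an oligonucleotide, which shows why flanking sequence outside the coding region would be needed to cover the first and last positions.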



74. The method of claim 72, wherein said degenerate trinucleotide cassette is 
comprised of a first mononucleotide cassette selected from the group consisting of: 
a degenerate A/C mononucleotide cassette, 
a degenerate A/G mononucleotide cassette, 
a degenerate A/T mononucleotide cassette, 
a degenerate C/G mononucleotide cassette, 
a degenerate C/T mononucleotide cassette, 
a degenerate G/T mononucleotide cassette, 
a degenerate C/G/T mononucleotide cassette, 
a degenerate A/G/T mononucleotide cassette, 
a degenerate A/C/T mononucleotide cassette, 
a degenerate A/C/G mononucleotide cassette, 
and a degenerate N or A/C/G/T mononucleotide cassette; 






and wherein said degenerate trinucleotide cassette is further comprised of a 
second and a third mononucleotide cassette, each selected from the group consisting of: 
a degenerate A/C mononucleotide cassette, 
a degenerate A/G mononucleotide cassette, 
a degenerate A/T mononucleotide cassette, 
a degenerate C/G mononucleotide cassette, 
a degenerate C/T mononucleotide cassette, 
a degenerate G/T mononucleotide cassette, 
a degenerate C/G/T mononucleotide cassette, 
a degenerate A/G/T mononucleotide cassette, 
a degenerate A/C/T mononucleotide cassette, 
a degenerate A/C/G mononucleotide cassette, 
a degenerate N or A/C/G/T mononucleotide cassette, 
a non-degenerate A mononucleotide cassette, 
a non-degenerate C mononucleotide cassette, 
a non-degenerate G mononucleotide cassette, 
and a non-degenerate T mononucleotide cassette. 
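
As a rough illustration of the size of the design space recited in claim 74, the sketch below enumerates the mononucleotide cassettes available at each position. It assumes that the eleven degenerate options correspond to every set of two or more bases, and that the four non-degenerate options are the single bases; the helper name `mononucleotide_cassettes` is hypothetical.

```python
from itertools import combinations


def mononucleotide_cassettes(min_bases):
    """List base sets of at least `min_bases` bases, written like 'A/C/G'."""
    return [
        "/".join(bases)
        for size in range(min_bases, 5)
        for bases in combinations("ACGT", size)
    ]


# First position: degenerate cassettes only (two or more bases).
first_position = mononucleotide_cassettes(2)
# Second and third positions: degenerate or non-degenerate cassettes.
other_positions = mononucleotide_cassettes(1)

# Number of distinct trinucleotide cassettes under these assumptions.
total_trinucleotide_cassettes = len(first_position) * len(other_positions) ** 2

if __name__ == "__main__":
    print(len(first_position), len(other_positions), total_trinucleotide_cassettes)
```

Under these assumptions there are 11 choices for the first position and 15 for each of the other two, giving 11 x 15 x 15 = 2475 possible trinucleotide cassettes.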

75. The method of claim 72, where said degenerate trinucleotide cassette is 
selected from the group consisting of: 

a degenerate N,N,N trinucleotide cassette, 
a degenerate N,N,G/T trinucleotide cassette, 
a degenerate N,N,G/C trinucleotide cassette, 
a degenerate N,N,A/C/G trinucleotide cassette, 
a degenerate N,N,A/G/T trinucleotide cassette, 
and a degenerate N,N,C/G/T trinucleotide cassette; 
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whereby, said method provides a means for generating all 20 amino acid changes 
at each amino acid site along a parental polypeptide template, because the 
degeneracy of the specified trinucleotide cassette sequences includes codons for 
all 20 amino acids. 
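
The statement that each listed cassette includes codons for all 20 amino acids can be checked by enumeration. The following sketch assumes the standard genetic code; the function names `expand_cassette` and `amino_acids_covered` are illustrative and do not appear in the specification.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def expand_cassette(spec):
    """Expand a cassette such as 'N,N,G/T' into its concrete codons."""
    positions = [
        "ACGT" if part == "N" else part.replace("/", "")
        for part in spec.split(",")
    ]
    return ["".join(codon) for codon in product(*positions)]


def amino_acids_covered(spec):
    """Return the set of amino acids (stop codons excluded) a cassette encodes."""
    return {CODON_TABLE[codon] for codon in expand_cassette(spec)} - {"*"}


if __name__ == "__main__":
    for spec in ["N,N,N", "N,N,G/T", "N,N,G/C", "N,N,A/C/G",
                 "N,N,A/G/T", "N,N,C/G/T", "N,A,N"]:
        codons = expand_cassette(spec)
        print(spec, len(codons), len(amino_acids_covered(spec)))
```

The N,A,N cassette is included only for contrast: it expands to 16 codons covering 7 amino acids, whereas each cassette listed in the claim covers all 20.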

76. The method of claim 72, wherein said degenerate oligonucleotide is 
comprised of a first homologous sequence and a plurality of trinucleotide cassettes; 

whereby, said method provides a means for generating a progeny polypeptide 
having a plurality of concurrent single amino acid changes with respect to a parental 
polypeptide template. 

77. The method of claim 76, wherein each of said degenerate trinucleotide 
cassettes is comprised of a first mononucleotide cassette selected from the group 
consisting of: 

a degenerate A/C mononucleotide cassette, 
a degenerate A/G mononucleotide cassette, 
a degenerate A/T mononucleotide cassette, 
a degenerate C/G mononucleotide cassette, 
a degenerate C/T mononucleotide cassette, 
a degenerate G/T mononucleotide cassette, 
a degenerate C/G/T mononucleotide cassette, 
a degenerate A/G/T mononucleotide cassette, 
a degenerate A/C/T mononucleotide cassette, 
a degenerate A/C/G mononucleotide cassette, 
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and a degenerate N or A/C/G/T mononucleotide cassette; 

and wherein each of said degenerate trinucleotide cassettes is further comprised of 
a second and a third mononucleotide cassette, each selected from the group consisting 
of: 

a degenerate A/C mononucleotide cassette, 

a degenerate A/G mononucleotide cassette, 

a degenerate A/T mononucleotide cassette, 

a degenerate C/G mononucleotide cassette, 

a degenerate C/T mononucleotide cassette, 

a degenerate G/T mononucleotide cassette, 

a degenerate C/G/T mononucleotide cassette, 

a degenerate A/G/T mononucleotide cassette, 

a degenerate A/C/T mononucleotide cassette, 

a degenerate A/C/G mononucleotide cassette, 

a degenerate N or A/C/G/T mononucleotide cassette, 

a non-degenerate A mononucleotide cassette, 

a non-degenerate C mononucleotide cassette, 

a non-degenerate G mononucleotide cassette, 

and a non-degenerate T mononucleotide cassette. 

78. The method of claim 76, where said degenerate trinucleotide cassette is 
selected from the group consisting of: 

a degenerate N,N,N trinucleotide cassette, 
a degenerate N,N,G/T trinucleotide cassette, 
a degenerate N,N,G/C trinucleotide cassette, 
a degenerate N,N,A/C/G trinucleotide cassette, 
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a degenerate N,N,A/G/T trinucleotide cassette, 
and a degenerate N,N,C/G/T trinucleotide cassette; 

whereby, said method provides a means for generating all 20 amino acid changes 
at each amino acid site along a parental polypeptide template, because the 
degeneracy of the specified trinucleotide cassette sequences includes codons for 
all 20 amino acids. 

79. The method of claim 72, wherein said degenerate oligonucleotide is 
comprised of a first homologous sequence, and a plurality of trinucleotide cassettes, and a 
second homologous sequence. 

80. A method for producing from a template polypeptide a set of progeny 
polypeptides in which a non-stochastic range of single amino acid substitutions is 
represented at each amino acid position, and for identifying desirable amino acid 
substitutions and combinations thereof among the progeny molecules, comprising the 
steps of: 

a) subjecting a codon-containing template polynucleotide to polymerase- 
based amplification using a degenerate oligonucleotide cassette for each codon to 
be mutagenized, wherein each of said degenerate oligonucleotides is comprised of 
a first homologous sequence and a degenerate trinucleotide cassette, so as to 
generate a set of progeny polynucleotides; and 

b) subjecting said set of progeny polynucleotides to clonal amplification such 
that polypeptides encoded by the progeny polynucleotides are expressed; and 
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c) subjecting said expressed progeny polypeptides to screening in order to 
compare them to the parental polynucleotide with respect to at least one molecular 
property of interest; 

whereby, said method provides a means for generating a predetermined number of 
amino acids to be represented at each amino acid site along a parental polypeptide 
template, up to as many as all 20 amino acids at each of said amino acid sites; and 

whereby, said method provides a means for identifying among said progeny 
polypeptides those that display a desirable change with respect to at least one 
molecular property when compared with its parental polypeptide. 

81. The method of claim 80, wherein said degenerate trinucleotide cassette is 
comprised of a first nucleotide selected from the group consisting of: 
a degenerate A/C mononucleotide cassette, 
a degenerate A/G mononucleotide cassette, 
a degenerate A/T mononucleotide cassette, 
a degenerate C/G mononucleotide cassette, 
a degenerate C/T mononucleotide cassette, 
a degenerate G/T mononucleotide cassette, 
a degenerate C/G/T mononucleotide cassette, 
a degenerate A/G/T mononucleotide cassette, 
a degenerate A/C/T mononucleotide cassette, 
a degenerate A/C/G mononucleotide cassette, 
and a degenerate N or A/C/G/T mononucleotide cassette; 

and wherein said degenerate trinucleotide cassette is further comprised of a 
second and a third mononucleotide cassette, each selected from the group consisting of: 
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a degenerate A/C mononucleotide cassette, 

a degenerate A/G mononucleotide cassette, 

a degenerate A/T mononucleotide cassette, 

a degenerate C/G mononucleotide cassette, 

a degenerate C/T mononucleotide cassette, 

a degenerate G/T mononucleotide cassette, 

a degenerate C/G/T mononucleotide cassette, 

a degenerate A/G/T mononucleotide cassette, 

a degenerate A/C/T mononucleotide cassette, 

a degenerate A/C/G mononucleotide cassette, 

a degenerate N or A/C/G/T mononucleotide cassette, 

a non-degenerate A mononucleotide cassette, 

a non-degenerate C mononucleotide cassette, 

a non-degenerate G mononucleotide cassette, 

and a non-degenerate T mononucleotide cassette. 

82. The method of claim 80, where said degenerate trinucleotide cassette is 
selected from the group consisting of: 

a degenerate N,N,N trinucleotide cassette, 
a degenerate N,N,G/T trinucleotide cassette, 
a degenerate N,N,G/C trinucleotide cassette, 
a degenerate N,N,A/C/G trinucleotide cassette, 
a degenerate N,N,A/G/T trinucleotide cassette, 
and a degenerate N,N,C/G/T trinucleotide cassette; 

whereby, said method provides a means for generating all 20 amino acid changes 
at each amino acid site along a parental polypeptide template, because the 
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degeneracy of the specified trinucleotide cassette sequences includes codons for 
all 20 amino acids. 



83. The method of claim 80, wherein said degenerate oligonucleotide is 
comprised of a first homologous sequence and a plurality of trinucleotide cassettes; 

whereby, said method provides a means for generating a progeny polypeptide 
having a plurality of concurrent single amino acid changes with respect to a parental 
polypeptide template. 

84. The method of claim 80, wherein each of said degenerate trinucleotide 
cassettes is comprised of a first mononucleotide cassette selected from the group 
consisting of: 

a degenerate A/C mononucleotide cassette, 

a degenerate A/G mononucleotide cassette, 

a degenerate A/T mononucleotide cassette, 

a degenerate C/G mononucleotide cassette, 

a degenerate C/T mononucleotide cassette, 

a degenerate G/T mononucleotide cassette, 

a degenerate C/G/T mononucleotide cassette, 

a degenerate A/G/T mononucleotide cassette, 

a degenerate A/C/T mononucleotide cassette, 

a degenerate A/C/G mononucleotide cassette, 

and a degenerate N or A/C/G/T mononucleotide cassette; 
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and wherein each of said degenerate trinucleotide cassettes is further comprised of 
a second and a third mononucleotide cassette, each selected from the group consisting of: 
a degenerate A/C mononucleotide cassette, 
a degenerate A/G mononucleotide cassette, 
a degenerate A/T mononucleotide cassette, 
a degenerate C/G mononucleotide cassette, 
a degenerate C/T mononucleotide cassette, 
a degenerate G/T mononucleotide cassette, 
a degenerate C/G/T mononucleotide cassette, 
a degenerate A/G/T mononucleotide cassette, 
a degenerate A/C/T mononucleotide cassette, 
a degenerate A/C/G mononucleotide cassette, 
a degenerate N or A/C/G/T mononucleotide cassette, 
a non-degenerate A mononucleotide cassette, 
a non-degenerate C mononucleotide cassette, 
a non-degenerate G mononucleotide cassette, 
and a non-degenerate T mononucleotide cassette. 
85. The method of claim 80, wherein said degenerate trinucleotide cassette is 
selected from the group consisting of: 

a degenerate N,N,N trinucleotide cassette, 
a degenerate N,N,G/T trinucleotide cassette, 
a degenerate N,N,G/C trinucleotide cassette, 
a degenerate N,N,A/C/G trinucleotide cassette, 
a degenerate N,N,A/G/T trinucleotide cassette, 
and a degenerate N,N,C/G/T trinucleotide cassette; 

whereby, said method provides a means for generating all 20 amino acid changes 
at each amino acid site along a parental polypeptide template, because the 
degeneracy of the specified trinucleotide cassette sequences includes codons for 
all 20 amino acids. 
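
The coverage stated in the whereby clause can be checked by enumeration. The following is a minimal sketch only, assuming the standard genetic code; the function names expand and encoded are illustrative and do not come from the specification. It expands each cassette recited in claim 85 into its codons and counts the distinct amino acids and stop codons obtained.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expand(cassette):
    """Expand a cassette such as 'N,N,G/T' into the set of codons it covers."""
    positions = []
    for part in cassette.split(","):
        part = part.strip()
        # 'N' means any of the four bases; 'G/T' means G or T; 'A' means A only.
        positions.append("ACGT" if part == "N" else part.replace("/", ""))
    return {"".join(bases) for bases in product(*positions)}


def encoded(cassette):
    """Return (set of amino acids, number of stop codons) for a cassette."""
    codons = expand(cassette)
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    stops = sum(1 for c in codons if CODON_TABLE[c] == "*")
    return amino_acids, stops


CLAIM_85_CASSETTES = [
    "N,N,N", "N,N,G/T", "N,N,G/C", "N,N,A/C/G", "N,N,A/G/T", "N,N,C/G/T",
]

for cassette in CLAIM_85_CASSETTES:
    aas, stops = encoded(cassette)
    print(f"{cassette:12s} codons={len(expand(cassette)):2d} "
          f"amino acids={len(aas)} stop codons={stops}")
```

The same helper can be applied to a restricted cassette such as N,A,N to see which subset of amino acids it reaches.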
Figure 2. Generation of a Nucleic Acid Building Block by Polymerase-Based Amplification. 
FIGURE 3. Unique Overhangs And Unique Couplings. 

The number of unique overhangs of each size (e.g. the total number of unique overhangs 
composed of 1 or 2 or 3, etc. nucleotides) exceeds the number of unique couplings that can 
result from the use of all the unique overhangs of that size. For example, the total number of 
unique couplings that can be made using all the 8 unique single-nucleotide 3' overhangs and 
single-nucleotide 5' overhangs is 4. 



PANEL A. 4 unique single-nucleotide 3' overhangs are possible (i.e., A, C, G, & T). For 
each of these there is a complementary 3' overhang with which it can pair (i.e., T, G, C, & 
A, respectively), as shown. 

PANEL B. However, the number of unique single-nucleotide 3' overhangs is greater than 
the number of unique couplings. Thus, only 2 intrinsically unique couplings exist using 
single-nucleotide 3' overhangs as shown. 

PANEL C. Likewise, 4 unique single-nucleotide 5' overhangs are possible (i.e., A, C, G, 
& T). For each of these there is a complementary 5' overhang with which it can pair (i.e., 
T, G, C, & A, respectively), as shown. 

PANEL D. However, the number of unique single-nucleotide 5' overhangs is greater than 
the number of unique couplings. Thus, only 2 intrinsically unique couplings exist using 
single-nucleotide 5' overhangs as shown. 
FIGURE 4. Unique Overall Assembly Order Achieved by Sequentially 
Coupling the Building Blocks 

Awareness of the degeneracy (between the number of unique overhangs and the number of 
unique couplings) is important in order to avoid the production of degeneracy in the overall 
assembly order of the finalized nucleic acid. However, a unique overall assembly order can 
also be achieved, despite the use of non-unique couplings, by using building blocks having 
distinct combinations of couplings, and/or by stepping the assembly of the building blocks in 
a deliberately chosen sequence. 



PANEL A. For example, one could attempt to assemble the following nucleic acid 
product using the 5 nucleic acid building blocks as shown. 

PANEL B. However, degeneracy in the overall assembly order of the 5 nucleic acid 
building blocks would be present if the assembly process were carried out in one step. 
For example, building block #2 and building block #3 could both couple to building 
block #1 as shown. 

PANEL C. However, a unique overall assembly order could be achieved by 
sequentially coupling the building blocks in 2 steps (rather than all at once) as shown. 
Figure 5. Unique Couplings Available Using a Two-Nucleotide 3' Overhang. 

16 unique 3' overhangs can be formed using two nucleotides. However, use of these 16 unique overhangs 
allows for the formation of only 6 unique couplings. Another 6 unique couplings are provided by the use 
of 5' overhangs formed using two nucleotides. Thus, a total of 12 unique couplings are provided by the 
combined use of 3' and 5' two-nucleotide overhangs. "Twin" couplings are marked in the same shading. 
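
The counts given for single-nucleotide and two-nucleotide overhangs can be reproduced by enumeration. The sketch below is illustrative only and rests on one assumption that the figure text does not state outright: an overhang pairs with its reverse complement, and self-complementary (palindromic) overhangs are left out of the count of unique couplings because such an overhang can pair with itself. The function names are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def count_couplings(n):
    """Count overhangs and couplings for overhangs of n nucleotides.

    Returns a tuple (number of unique overhangs,
                     number of couplings between two different overhangs,
                     number of self-complementary overhangs)
    for one overhang polarity (3' only or 5' only).
    """
    overhangs = ["".join(p) for p in product("ACGT", repeat=n)]
    pairs = set()
    self_complementary = 0
    for overhang in overhangs:
        partner = reverse_complement(overhang)
        if overhang == partner:
            self_complementary += 1      # e.g. AT, TA, CG, GC when n == 2
        else:
            pairs.add(frozenset((overhang, partner)))
    return len(overhangs), len(pairs), self_complementary


for n in (1, 2):
    total, couplings, palindromes = count_couplings(n)
    print(f"n={n}: {total} overhangs, {couplings} couplings per polarity, "
          f"{2 * couplings} with 3' and 5' combined, "
          f"{palindromes} self-complementary overhangs")
```

Under this assumption the enumeration gives 2 couplings per polarity for one nucleotide and 6 per polarity for two nucleotides, that is 4 and 12 when 3' and 5' overhangs are combined.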
Figure 7. Synthetic genes from oligos. 
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c 
c 
c 



CCX3T 



A5faMX3CACG GCGATATTTC ATCGAGCMI GACACGGTCG GCGTTGCCGT 
ATG3ATCACG GCGACATTTC ATCGAGCAAT GACACGGTCG GCGTTCCCGl 
ATGAGACACG GAGATATCTC CAGCAGCAAC GATTGCGTGQ GCGTGGCCGT 



GAG GT 

CGTGAACTAC AAGAT GCCT C GCCTTCATAC CAAGGC^Q GOTTTAGCGA 
CGTGAACTAC AAGATGCCGC GGCTTCACAC CAAGGCtSAQ GIGCTGGCCA 
CGTGAACTAC AAGATGCCGC GGCTGCATAC CCGCGC qGAG Glj OATCGAGA 



CGG 



ACGCCAGAAA GATCGGCS^ ATGATCGTCG GCATGAAGAC 
ACTGCCGCAA GATCGCCGAC ATGCTGGTCG GCATGAAGAQ 
ACGCCCGCAA GATCGCCGAC ATGGTCGTGG GCATGAAGCG 

CCACG 



CGG^CTGeSG 

cgg :ctgccg 
cgg::ctgccc 



GGAATGGATC TGGTGATCTT CCCGGAATai TCGAC CCACG 
GGAATGGATC TGGTGATCTT CCCGGAATAT TCCAC CCACG 
GGCATGGACC TGGTCATCTT CCCCGAGTAC TCCACjcCACG 



GCATCATGTA 
GCATCATGTA 
GCATCATGTA 



CCC GG 



CGACTCCAAG GAAATGTACG ATACCGCGTC CGTCGTCCSffi GGMAGGAGA 
CGACTCCAAG GAGATGTACQ ACACGGCGTC GACGGTCCCG GG PGAAGAGA 
CGACGCCAAG GAAATGTACG AAACCGCTTC GGCCAt JcCG GGg GAAGAGA 



G GGG 



CCGAGATTTT TGCCGA AGCC TGCCGCAAGG CGAAAGT^ GGG3GTG12S 
CCGAGATTTT CGCCGAGGCC TGCCGCAAGG CCAAGGTCTG GGG2GTGTTC 
CTGCTGTGTT CGCCGAGGCC TGCCGCAAGG CCAACGTAi p GGGp GTGTTT 



AAAG C 

TCGCTCACCG GCGAACGTCA CGAGGAACAT CCGAAOAAGG OGCCCTACAA 
TCGCTGACCG GCGAGCGCCA CGAGGAGCAT CCCAAIAAAG CGCCGTACAA 
TCGCTGACGG GCGAGCGCCA CGAAGAGCAC CCGAAqAAGG^CCGTACAA 



CAGAA 

CACQCTGATC CTGATGAACG ACAAGGGCGA GGTGGTde^T^TACCGCA 
CACCCTGATC CTGATGAACG ACAAGGGTGA AGTCGTTCAG AAATATCGCA 
CACGCTCATC CTGATGAACA ACAAGGGCGA GATCGTqCAQ^^pTACCGCA 



GGTA 



AGATCATGCC GTGGGT TCCG ATCGAGGGCT 
AGATCATGCC GTGGGTGCCG ATCGAAGGCT 
AGATCATGCC CTGGGTGCCG ATCGAAGGCT 

TGAAG 

TACGTCTCCG ACGGGCCGT^ GGGC^ff^S 
TACGTCTCCG AAGGCCCGAA GGGC^ TGAAG 
TATGTGTCGG AAGGCCCCAA GGGAC TTGAAG 



GGTApCCCGG CAACTGC|^ 
GGTArCCCGG CAACTGCACG 
GGTArCCGGG CGATTGCACC 



GTTTCGCTGA TCATCTGCGA 
ATGTCGCTGA TCATCTGCGA 
ATCAGCCTCA TCATCTGCGA 



TGACGGCMS TATCCGGAAA 
CGACGGCAAC TACCCGGAAA 
CGACGGCAAT TACCCCGAGA 



TCTGGCQ 

TCTGGCGp GA CTGCGCCaiS AAGGGCGCCG 
TCTGGCGrGA CTGCGCGATG AAGGGCGCCG 
TCTGGCGCGA TTGCGCCATG CGCGGCGCCG 
CCAG 

AGCTGATCGT GCGCXaCEE 
AACTGATCAT CCGCTGCCAG 
AGCTGATCGT GCGTTGCCAG 



GGCTACATGT ATCCGGCCAA GGACCAG£aS 
GGCTACATGT ATCCCGCCAA GGATCAGCAG 
GGATACATGT ACCCGGCCAA GGACCAGCAG 



GC 



GTCATCATGG CGAAGSaSAT GGCGTG GGCG AATAATTGTT ACGTCGCGGT 
GTGCTGATGG CGAAA3CJAT GGCGTGGGCG- AACAACGTTT ATGTCGCGGT 
GTCATGGTGT CCAAC^qirAT GGCGTGGATG AACAACGTCT ACGTGGCGGT 

GGGCTTCG 



TTCCAATS2C GOSGGCTTCa ATGGCGTCTA TTCGTATHC GGCCACTCGO 
CGCCAATGCC TC3GGCTTCG ACGGCGTCTA CTCGTATTTC GGCCATTCGG 
GGCCAATGCC GCSGGCTTCQ ACGGCGTGTA TTCCTACTTC GGCCATTCGG 



TTCGA 

CGATCATCGG CtTTCGyGGC CGCACGCTCG GCGAATGCGG CGAGGAAGAa 

CGATCATCGG CrTCGACGGC CGTACCCTCG GCGAATGCGG CGAGGAGGAT 

CGATCATCGG CrTCG;^CGGC CGCACGCTGG GCGAATGCGG TGAAGAAGAC 



C ACTA 



TACGGCATdc"AGT3rGCCCA GCTTTCG^Ag ATGCTGATCC GCGACGCCCG 
TATGGCATCC AGTATGCOGC CATCTCCAAG TCGCTGATCC GCGACGCGCG 
ATGGGCGTGCMTACGCCGA GCTCTCCACC AGCCTGATCC GCGACGCGCG 



CAATC 

CCGCACQGCa CRKTOSGAAA ACCATCTCTT CAAGCTGGIG CATCGTGGCT 
CCGCACCGGC CAATC 3GAAA ACCATCTCTT CAMCTGGTG CACCGTGGCT 
CAAGAACATG CAGTC3CAGA ACCACTTGTT CAAGCTGGTG CACCGCGGCT 



ACACCGGGTT 
ACACCGGCAT 
ACACCGGCAA 



GATCAA 

GATgAA p TCC GGCGAGGGCG ACCGCGGTCT CGCGGCQIGT 
GATCAA rrCC GGCGAGGGCG ACCGCGGTGT CGCGGCTTGC 
GATCAA FTCC GGCGAAGAGG CCACCGGCGT CGCGGCATGC 



TTA 



150ainl3_00 
150AM7_001 
431aro7 002 



150aml3_00 
X50AM7_001 
4 3 lam? 002 



150aml3_00 
150AM7_001 
431am7 002 



cdr^tgaqt tctacaacaa atggat cgcc gatccggaag gcacccgcga 
ccgtapgatt tctattcgaa atggatcgcc gatcccgagg gtacacgcga 
ccgta:aact tctacgccaa ctggatcaac gatccggagg gcacgcgcaa 



ATGGT 

;Jatgg^cgag tcctttaccc ggccgacggt gggaaccgat gaagcgccca 
gatggiggaa tccttcacgc gtccgacggt gggtgtggag gaatgcccga 
catggicgaa tccttcaccc ggtccaccgt gggcacgccg gagtgcccca 



TCGAG 

rCGSSpGCAT CCCGAACMS GTCGCGGTGC ACCGCTGA 
TCGAG 3GCAT TCCGAACAAG GCCACCACGC ACCGCTGA 
TGGAC3GCAT CCCCAACGAG GACGCCAAGC ACCGCTAG 



aagct 
aagct 
a aoct 
Hindlll 
Figure 8. Nucleic acid building blocks for synthetic ligation gene reassembly. 

[Diagram: building blocks and their overhangs, running from the NcoI site to the HindIII site.] 
Figure 9. Addition of Introns by Synthetic Ligation Reassembly. 

NcoI 
Figure 10. Ligation Reassembly Using Fewer Than All The 
Nucleotides Of An Overhang. 

Gap Ligation 

[Diagram of building blocks and their overhangs; the sequence labels are not legible.] 

Ligation of one strand only; 
gap in second strand can be repaired in vivo 
Figure 11. Avoidance of unwanted self-ligation in palindromic couplings. 

[Diagram: building blocks with ends labeled 5'P or 5'OH.] 

No self ligation of end primers 
with palindromic overhangs 
Figure 12 

Site-Directed Mutagenesis by Polymerase-based Extension 




Molecule (A) Molecule (B) 
Figure 13 

Site-Directed Mutagenesis By Polymerase-based Extension 
and Ligase-based Ligation 




Molecule (A) Molecule (B) 
Figure 14 

Strategy for obtaining and using nucleic acid binding proteins that facilitate 
entry of genetic vaccines. 

Evolution in M13 Format 

M13 
phage 

pVIII protein/cDNA library 

Target Tissue 

Multiple cycles of 
panning/screening 

Genetic vaccine coated for ease of entry 

Genetic vaccine (e.g. naked DNA) 

M13 pVIII coating protein 

Evolved ligand (fused to pVIII) 
which directs DNA into cell 
Figure 15 

A schematic representation of a method for evolving a chimeric, 
multivalent antigen that has immunogenic regions from multiple 
antigens. 
Figure 16 

Method for Obtaining Non-Stochastically Generated Polypeptides that can 
induce a Broad-Spectrum Immune Response. 

[Diagram. Legible labels: Wild-type; Pathogen: A, B, C; Protection; Evolved A/B/C Chimera; Evolved A/B Library.] 

Figure 17 

Possible factors for determining whether a particular polynucleotide 
encodes an immunogenic polypeptide having a desired property. 
Figure 21 

Schematic representation of a multimodule genetic vaccine vector 
(relative sizes of functional units are not drawn to scale) 



Intron 



Operator 



Enhancer 
Promoter 



Bacterial Origin 
(or serviceable 
origin in other 
recombinant host) 



Bacterial Marker 
(or serviceable 
marker in other 
recombinant host) 



Antigen(s) 




Poly-A 



Co-stimulatory 
molecules 



Mammalian 
Origin 



Inducible 
Repressor/ 
Transactivator 
Figure 22A and 22B 
Generation of vectors with multiple T cell epitopes. 

Library of experimentally generated 
polynucleotides 

[Diagram: repeated units, each drawn as a promoter joined to an epitope.] 

Library of experimentally generated 
polynucleotides 

[Diagram: a promoter followed by multiple epitope units.] 
Figure 23 

Generation of optimized genetic vaccines by directed evolution 

A parental polynucleotide set 
comprised of 1 or more 
gene(s)/vector(s) 

A) Directed Evolution 

Library of experimentally generated polynucleotides 

B) Screening & selection using 
human (or other mammalian) cells 
(e.g. in vitro; e.g. well-based 
or flow cytometry-based) to select for: 
•transfection efficiency 
•expression of antigen 
•activation of lymphocytes and antigen 
presenting cells 
•induction of cytokine synthesis 

C) In vivo screening for improved immune responses: 
•mouse model 
•SCID-hu mice 
•large animals 



Select desirable molecules 



Optionally subject to 1 or more rounds of 
directed evolution and selection 
Figure 24 

Recursive application of directed evolution and selection of evolved promoter 
sequences as an example of flow cytometry-based screening methods. 

Library of experimentally generated promoters 
(e.g. derived by subjecting 1 or more naturally 
occurring CMV promoters to 1 or more directed 
evolution methods as described herein) 

1. Screen/Select optimized cells 
(e.g. by flow cytometry) 

2. Recover pool of transfected 
promoters (e.g. by polymerase- 
based amplification, DNA 
mini-preps, or other DNA 
isolation procedure) 

3. Subject selected sequences to 
1 or more additional rounds of 
directed evolution to achieve 
further optimization 
Figure 26 Panel A 

Non-stochastic polynucleotide reassembly in combination with 
non-stochastic polynucleotide site-saturation mutagenesis. 

Shown below is a non-limiting example of a permutation of the directed evolution 
methods described herein 

Parental Set comprised of 1 or more 
polynucleotide templates (e.g. viruses) 

Direct Evolution (preferably, for example, non-stochastic polynucleotide 
reassembly and/or polynucleotide site-saturation mutagenesis) 

Progenitor Set # 1 
Library of experimentally generated 
(e.g. chimeric viruses) 

Screen/Select 

Progenitor Set # 1A: Optimized molecules 
A subset of Progenitor Set # 1 comprised of the most desirable 
and/or highly optimized subset of molecules 

Non-stochastic polynucleotide site-saturation mutagenesis 

Progenitor Set # 2 
Library of experimentally generated 
(e.g. site mutagenized viruses) 

Screen/Select 

Progenitor Set # 2A: More optimized than Progenitor Set # 1A 
A subset of Progenitor Set # 2 comprised of the most desirable 
and/or highly optimized subset of molecules 

Combine point mutations and/or 
subject to further chimerizations 

Progenitor Set # 3 
Library of experimentally generated 
(e.g. site-mutagenized viruses) 

Screen/Select 

Progenitor Set # 3A: More optimized than Progenitor Set # 2A 
A subset of Progenitor Set # 3 comprised of the most desirable 
and/or highly optimized subset of molecules 
Figure 26 (continued) Panel B 

Screening of experimentally generated molecules produced by non-stochastic 
polynucleotide reassembly in combination with non-stochastic polynucleotide site 
saturation mutagenesis 

[Plot. Legible labels: 100 (axis tick); #3A.] 

Optimized molecules 
selected from Progenitor 
Subsets 1A, 2A, and 3A 
Figure 27 

Vector for promoter evolution 

Working promoter (e.g. subject 
to screening and/or 1 or more 
additional rounds of direct 
evolution) 
Figure 28 

Iterative evolution of inducible promoters using directed evolution and flow 
cytometry-based selection. 

Library of experimentally 
generated promoter nucleic acids 
(e.g. derived by subjecting 1 or 
more promoters to 1 or more 
directed evolution methods as 
described herein) 



Uninduced 
(no tetracyclin) 




Screen/Select 
cells with least 
expression 



Recover pool of promoter sequences 
(e.g. by polymerase-based 
amplification, DNA mini-preps, or 
other DNA isolation procedure) 




Induced 

(tetracyclin added) 



Screen/Select 
cells with most 
expression 



1 



Subject selected 
sequences to 1 or 
more additional 
rounds of directed 
evolution and flow- 
cytometry based 
screening 
Figure 29 

The present invention provides that a genetic vaccine can be subjected to directed 
evolution in order to achieve improved effectiveness upon administration by oral, 
intravenous, intramuscular, intradermal, anal, vaginal, or topical delivery 
methods. 

The figure below shows an example of the directed evolution of a genetic vaccine, comprised of an M13 
phage-based vaccine, to achieve optimization for oral delivery. 
Figure 30 

An alignment of the nucleotide sequences of two human CMV strains 
and one monkey strain. 



[Alignment figure: sequences AF026939 CMV, AF047524 hum UL104, and AF078102 Rhesus, aligned over positions 1 to 2214. The aligned bases are not legible in this copy.] 
Figure 31 

An alignment of IL-4 nucleotide sequences from 3 species 
(human, primate, and canine). 



[Alignment figure: sequences AF187322 Canis IL-4, NM_000589 Homo sapiens IL-4, and U19838 Cercocebus IL-4, aligned over positions 1 to 637. The aligned bases are not legible in this copy.] 
Figure 32 

Evolution of polypeptides by synthesizing (in vivo or in vitro) corresponding 
deduced polynucleotides and subjecting the deduced polynucleotides to directed 
evolution and expression screening subsequently expressed polypeptides. 

Genomic DNA or cDNA library 

Expression screen library products 
(e.g. polypeptides expressed by genes 
in the library) 

Polypeptides (or other gene products) 
to be evolved 
E.g. polypeptides or gene 
products, promoters, etc. 

Align polypeptides 

Aligned Polypeptide Sequences 
Consensus amino acids are boxed. Alignments 
can be performed, e.g., using software such as 
Vector NTI™ (Informax Inc.) or MACAW 
(Greg Schuler, NCBI, NLM, NIH). 

Determine deduced coding 
sequences using the same codon for 
each consensus amino acid 

Aligned Polynucleotide Sequences 
Consensus nucleotide bases are boxed 

Subject to direct evolution by: 

1) Non-stochastic polynucleotide 
reassembly; and/or 

2) Non-stochastic site-saturation 
mutagenesis 

Library of experimentally 
generated polypeptides 

Expression Screen 

Optionally repeat 
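
The step of determining deduced coding sequences with the same codon for each consensus amino acid can be sketched as follows. This is an illustration only: the codon choices in FIXED_CODON, the example peptides, and the function names are hypothetical and are not taken from the figure, and the sketch assumes gap-free aligned sequences of equal length.

```python
# Hypothetical fixed codon choice: exactly one codon per amino acid.
# These particular codons are an illustrative assumption.
FIXED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def back_translate(protein):
    """Deduce a coding sequence by using one fixed codon per amino acid."""
    return "".join(FIXED_CODON[aa] for aa in protein)


def consensus_columns(aligned):
    """Return the alignment columns where every sequence has the same residue."""
    return [i for i, column in enumerate(zip(*aligned)) if len(set(column)) == 1]


# Hypothetical aligned polypeptides (gap-free, equal length).
aligned_peptides = ["MKTAY", "MRTAY", "MKSAY"]
deduced = [back_translate(p) for p in aligned_peptides]

for peptide, dna in zip(aligned_peptides, deduced):
    print(peptide, dna)

# Because one codon is used per amino acid, every consensus amino acid
# column becomes a run of three consensus nucleotide columns.
for i in consensus_columns(aligned_peptides):
    print(i, {dna[3 * i:3 * i + 3] for dna in deduced})
```

Using a single codon per amino acid means that identity at the polypeptide level carries over to identity at the polynucleotide level, which is what makes the deduced sequences suitable templates for reassembly at shared positions.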
Figure 33 

Directed evolution of polynucleotides (e.g. promoter sequences) 

This figure shows an example of the application of non-stochastic site-saturation mutagenesis 
in combination with non-stochastic reassembly (e.g. oligo-directed CpG deletion(s) and/or 
addition(s)) 

Parental set comprised of 1 or more 
promoter sequences (natural and/or 
experimentally generated), each of 
which has a plurality of CpG 
motifs, some of which are essential 
for function, others which 
eventually cause shut-down of the 
promoter 

Design oligos which each delete 
and/or add in 1 or more of the CpGs. 

Parental set 

Oligos 

CpGs to be introduced experimentally 

Site-saturation mutagenesis 
(optionally in combination with non- 
stochastic gene reassembly) 

Progenitor Set # 1 
Library of experimentally 
generated promoters 

Screen for promoters that are functional 
and do not lead to shutdown in cells. 

Progenitor Set # 1A 
Optimized 

CG : CpGs that appear to be beneficial, 
essential, or nonreplaceable (in the context of 
all other mutations, if any) 
XX : CpGs which could be replaced with the 
selected sequence (in the context of all other 
mutations, if any) 
OO : CpGs that may be beneficial (or have neutral 
effects) when added in (in the context of all 
other mutations, if any) 
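
The oligo design step, in which each oligo deletes 1 or more CpGs, can be illustrated by locating the CpG motifs in a promoter sequence and listing the variants in which a single CpG has been removed. This is a sketch under stated assumptions: the function names are hypothetical, the example sequence is made up, and replacing CG with TG is an arbitrary illustrative substitution, not a choice taken from the figure.

```python
def find_cpg_sites(seq):
    """Return the 0-based start index of every CG dinucleotide in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def single_cpg_deletion_variants(seq, replacement="TG"):
    """Map each CpG position to the sequence with only that CpG replaced.

    The default replacement (CG -> TG) is an illustrative assumption.
    """
    seq = seq.upper()
    return {
        i: seq[:i] + replacement + seq[i + 2:]
        for i in find_cpg_sites(seq)
    }


promoter = "ACGTCGA"  # hypothetical promoter fragment
print(find_cpg_sites(promoter))
for position, variant in single_cpg_deletion_variants(promoter).items():
    print(position, variant)
```

Each variant corresponds to one candidate oligo; screening the resulting library would then indicate which CpGs are replaceable and which are not.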
Figure 34 

An example of a CTIS obtained from HBsAg polypeptide (PreS2 plus S regions). 
Figure 35 

An example of a CTIS having heterologous epitopes attached to the cytoplasmic 
portion. 
Figure 36 

Method for preparing immunogenic agonist sequences (IAS). 



WT sequence 



Mutated 



Assembly (+/- screen) 



Reassembly (+/- screen) 



Poly-epitope region containing potential agonist sequences 



i 



Non-stochastic site saturation mutagenesis 
(+/- screen) 



Additional library of 
progeny molecules 



Further optimized poly-epitope region containing potential agonist sequences 



Direct evolution (+/- screen) 
Repeat as desired 
Figure 37 

Improving Immunostimulatory Sequences (ISS) Using Directed Evolution. 

Assembly 

Oligonucleotide building blocks 
(e.g. synthetically generated), oligos with 
known ISS containing hexamers, poly A, C, G, T, 
and other polynucleotides 

Clone into a vector, generate a library 
(by directed evolution) 

In vivo studies: 
- Mice 
- SCID-hu mice 
- large animals 
- human 

Screening for: 

- enhanced cytokine synthesis by human PBL 

- improved activation of human B lymphocytes 
Figure 38 

Screening to identify IL-12 genes that encode recombinant IL-12 having an 
increased ability to induce T Cell proliferation. 

Working Progenitor templates 
Library of IL-12 genes 
(p35/p40 fusions) 

1) Directed Evolution 
2) Express in bacterial host (bacterial colonies) 
3) Robotic colony picking (one colony/well) 
   96 wells X 50 
4) High throughput plasmid purification (e.g. PERFECT prep-96 kit) 
5) Transfection to CHO cells 
6) Transfer of supernatants to human T cell cultures 
7) Identification and selection of clones inducing most potent T cell proliferation 
8) Optionally repeat steps 1-7 
Figure 40 

Screening to identify CD80/CD86 chimeric genes having an improved capacity to 
induce T Cell activation or anergy. 

Library of Working 
Progenitor templates of 
CD80 &/or CD86 genes 

1) Directed Evolution 
2) Express in bacterial host (bacterial colonies) 
3) Robotic colony picking (one colony/well) 
   96 wells X 50 
4) High throughput vector purification (e.g. PERFECT prep-96 kit) 
5) Transfection to dendritic cells/U937 cells 
6) Co-culture with T cell cultures 
7) Identification and selection of clones inducing most potent T cell activation or anergy 
8) Optionally repeat steps 1-7 
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Figure 41 

Figure 41. An alignment of two CMV*derived nucleotide sequences from 
human and primate species. 



[Alignment rows for AF078102 Rhesus and M67443 Towne, alignment positions 1-700. Sequence text not legible in the source.]
Figure 41 continued

[Alignment rows for AF078102 Rhesus and M67443 Towne, alignment positions 701-1400. Sequence text not legible in the source.]
Figure 41 continued

[Alignment rows for AF078102 Rhesus and M67443 Towne, alignment positions 1401-2077. Sequence text not legible in the source.]
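A pairwise alignment such as the one in Figure 41 is often summarized by its percent identity. The sketch below shows one common way to compute it from two gapped, equal-length alignment rows. The function name is an assumption made here, the demo sequences are invented for illustration and are not taken from AF078102 or M67443, and gapped columns are excluded from the denominator, which is only one of several conventions in use.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither row has a gap.

    Both rows must come from the same alignment, so they have equal length.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")

    compared = 0
    matches = 0
    for base_a, base_b in zip(row_a.upper(), row_b.upper()):
        if base_a == gap or base_b == gap:
            continue  # skip columns that contain a gap in either row
        compared += 1
        if base_a == base_b:
            matches += 1

    # No comparable columns: report 0.0 rather than dividing by zero.
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Invented rows: 5 ungapped columns, 4 of them identical.
    print(percent_identity("ATCG-A", "ATGGTA"))
```

Counting gapped columns as mismatches instead would give a lower figure for the same alignment, so the convention should be stated alongside any reported value.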
Figure 42

Figure 42: An alignment of the IFN-gamma nucleotide sequences from
human, cat, and rodent species.



[Alignment rows for AF081502 Marmota monax IFN-gamma, D30619 Felis catus IFN-gamma, and X87308 Homo sapiens IFN-gamma, through alignment position 569. Sequence text not legible in the source.]



